Food Dysmorphia

status: WIP / 09-2023 / 06-2024

Food Dysmorphia uses AI technology to empower citizens to critically examine the impact of the industrial food complex through aesthetics and storytelling. It addresses the hidden economic aspects of the food system, revealing the true costs obscured within the complexities of the food industrial complex. The project maps out food inequalities, visually representing externalized costs to enhance understanding. It aims to develop a "reality filter" using AI and computer vision, altering images of fruits and vegetables to reflect their true costs, thus facilitating discussions, actions, and knowledge sharing. The target audience includes individuals interested in sustainability, food systems, and social change, especially those open to deeper exploration of these topics. This is a political project, shedding light on the political dimensions of the food industry and its societal impact.

A Hungry EcoCities project
team: Bernat Cuní + EatThis + KU Leuven Institutes + Brno University of Technology + In4Art

Exploring systemic failures and opportunities through the food’s look & feel

Using generative AI and computer vision, this project applies reality filters to narrate food realities and uncover untold stories, shedding light on system failures and food-related issues.

research & references links -> food.cunicode.com/links

initial approach

this project answers the specific research direction:

The eco-modernism / alternative food production systems standoff

In the 2022 documentary The Future of Food, eco-modernist Hidde Boersma and Joris Lohman, an advocate for more traditional, nature-inclusive food production practices, challenge the dichotomies between them. Boersma and Lohman represent a large societal standoff between those who believe in tech-driven innovation in agriculture (Boersma) and those who reject the influence of modern technology and advocate living in closer harmony with nature, e.g. by adhering to the principles of permaculture (Lohman). The documentary shows how this (Western) standoff is preventing change in the food system, and that there are things to gain if we listen more to each other. Both ‘camps’ claim they are the true admirers of Nature: the eco-modernists by trying to perfect nature, the permaculturalists by complete submission to nature. It raises the question of whether both viewpoints are at odds with each other, or whether an eco-modernist permaculture or a permacultural eco-modernist garden would be feasible. Could we explore if a hybrid model could have benefits? What would a food forest inside a greenhouse look like? What additional streams/functionalities can be supported by the greenhouse? Can greenhouses become more nature/biodiversity-enhancing spaces? How could recent biome research in biosciences be introduced into CEA systems? What type of data models would be necessary to support it? But the opposite direction could also be explored: What would adding more control to alternative practices like biological or regenerative farming yield? How would the AI models used in CEA respond to a poly-organized production area?


DEVELOPMENT

below, all the content that I develop for this project, probably in chronological order, but maybe not.

Food? which food?

Contextualising “food” for this project: include fruits and vegetables, excluding growing and distribution stages, and post-consumer or alternative processing stages

Why How What

Positioning the project

The starting point

“Capitalism shapes the food system, and while most of the actors act in good faith, collectively the food system is filled with inequalities, absurdities and abuses. Because the food system is not organised as a system but more as a network of independent actors, those systemic failures are apparently nobody’s fault. I believe that what ties it all together is capitalism, which, being extractive by nature, rejects any approach that is not as profitable as the current one. Thus a fight for a fair food system is an anticapitalist fight.” initial biased assumption

Narratives (possible)

  • Do all technical innovations benefit the profits, not the consumer?

  • It is not that we do not know how to make food for the whole world, it is that we do not know how to do it while accumulating large amounts of profit

  • Preventing something “bad” is more impactful than doing something “good”

  • If food is not affordable or accessible, it perpetuates inequalities


Field Research / technofetishism

Together with other HungryEcoCities members, we did a field trip to Rotterdam to visit the Westlands, the place where horticulture is booming.

At the Westland Museum, we learned about the history (and economics) of greenhouses and how crucial it was for Holland to have a rich industrial England to sell expensive grapes to. We visited Koppert Cress and saw how robots (and people) grow weird plants that fit their business model of catering to high-end restaurants. We explored the +45 varieties of tomatoes at Tomatoworld and saw how they use bumblebees to pollinate their plants. At the World Horti Center we were presented with the techno-marvels that are supposed to keep the Netherlands on top of the food chain. I also went to Amsterdam to meet with Joeri Jansen and discuss behaviour change & activism from an advertising point of view. There, while in the hipster neighborhood of De Pijp 🙄, I visited De Aanzet, a supermarket that presents two prices to the customers: the real one, and the one including the hidden costs, so the customer can decide whether to pay the fair price or not.

Project KPI’s flow

Visual Classifier / fruits and vegs

Testing different approaches to build an entry touchpoint to FoodDysmorphia.

goal: process images of food (fruits & vegetables) and reject non-food images/scenarios/content

A classifier is needed. Local: running in Python = √
Remote: via Hugging Face or a virtual server / or using an API such as Google’s Vision AI
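The accept/reject goal above can be sketched independently of which classifier ends up being used. This is a minimal sketch, assuming the classifier (local or remote) returns a mapping of label to probability. The label set, the threshold and the function name `gate` are hypothetical and only illustrate the gating step.

```python
from typing import Dict, Optional

# Hypothetical label set; a real classifier defines its own labels.
FOOD_LABELS = {"lemon", "avocado", "tomato", "banana", "carrot"}


def gate(scores: Dict[str, float], threshold: float = 0.6) -> Optional[str]:
    """Return the accepted food label, or None to reject the image.

    `scores` maps label -> probability, as produced by any image classifier.
    The threshold value is an assumption and would need tuning on real photos.
    """
    if not scores:
        return None
    label, prob = max(scores.items(), key=lambda kv: kv[1])
    if label in FOOD_LABELS and prob >= threshold:
        return label
    return None


print(gate({"lemon": 0.91, "tennis ball": 0.05}))   # accepted
print(gate({"tennis ball": 0.80, "lemon": 0.15}))   # rejected: not food
print(gate({"lemon": 0.40, "carrot": 0.35}))        # rejected: not confident enough
```

Keeping the gate separate from the model makes it possible to swap the local model for a remote API without touching the rest of the touchpoint.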

field research

finetune

learning: when bringing the cost issue into the table, the conversation is not the same anymore (good)

Frame the anticapitalist narrative within the ecosystem, together with solutions, approaches, facts, aesthetics -> to not be perceived solely as a naive rant.

conceptual framework

Technical Aesthetics Exploration

Mapping the narratives of the food system

Focus Exercise

“write a press release as if the project is already done”

This helped to frame/visualize/experience the ongoing project concept into something tangible.

Month 1 + 2 / development report

Challenges

At this point, the critical things to decide are:

  • Find the Aesthetics:

    • Generative AI is a copy machine: it fails at true photorealism (it is capable of making perfectly beautiful hands, but with 7 fingers 🤦‍♂️), yet it is able to replicate existing visual styles.

Trying to avoid photorealism, a graphics-oriented aesthetics could work.

Propaganda poster aesthetics is explored as a starting point.

flat colors / defined shapes / complementary chromatic schemes

  • Find the Voice

    • who is telling the story? the food? the people in the food system?

    • finding the characters/agents/individuals to tell stories through

stereotypical profiles of food system’s agents

  • Call to Action

    • What do we want to happen? Find a balance between activism / alarmism / solutions

exploring the possible multi-step process to tell a story: upon image submitted, an agent (person) starts the narrative / sets the context / problem exposed / action-solution

Technical context checkup:

Why AI?

  • at scale / reach / framework where things can happen

  • averageness / common denominator / popular culture

Why AR?

  • relatable / personal context 

  • point of reference 


Group Session / Turin Nov’23

Two full days of presentations & discussions, led by the Carlo Ratti Associati team.

I structured my presentation with an intro to my relationship with AI tools and how they tend towards the mainstreamification of content, and how this relates to the project.

For this project the use of AI is actually related to averageness: using a tool capable of communicating/condensing concepts in the most transversal and understandable way for many types of audiences.

For each presentation each of the members took notes and added comments to a feedback form that was then shared.

Next steps

From the discussions and feedback the areas of further research are the following

  • this sounds like a tactical media project. (context)

  • potential polarising effects (desired?)

  • how to avoid eyerolling 🙄

  • limit the scope/effect of the output? geographically / sector / company ?

Thoughts

The problem with pointing things out: nobody cares.

I’m subscribed to multiple newsletters and communities around food, and I find myself deleting emails before reading them, same with climate change content, same with other issues… It is not about the information… it is not about the data.


Why Why Why?

When researching a topic, often all roads point to capitalism’s systemic economic failures. Food is no exception.

So, maybe instead of following the thread (positive or negative), we can showcase that path, and communicate a narrative without overwhelming with data or pointing fingers and losing the audience halfway… 🤔

Here is this bit by Louis CK, where he explains how kids keep asking “why?” endlessly. It is fun, and it is also an Iterative Interrogative Technique, which made me want to try it out.

I instructed a language model to behave in this way, the “5Why” model, with the goal of linking any given insight to economic reasoning.

And it works! Best of all, it works for both positive and negative scenarios, such as:

  • “why are fruits so beautiful at the supermarket?” gives a reasoning towards market preferences and how these influence farmers

  • “why is buying local good?” explains how this strengthens communities and makes them more resilient within the global economic context
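The iterative “why?” loop can be sketched as a small function that feeds each answer back in as the next question. This is only an illustration under stated assumptions: `ask` stands for whatever language model call is used, the prompt wording is invented, and `canned_model` is a stand-in so the sketch runs offline.

```python
from typing import Callable, List


def five_whys(insight: str, ask: Callable[[str], str], depth: int = 5) -> List[str]:
    """Ask "why?" `depth` times, feeding each answer back as the next statement."""
    chain = []
    statement = insight
    for _ in range(depth):
        # Hypothetical prompt wording; the steering towards economics is the key part.
        prompt = (
            f"Statement: {statement}\n"
            "Why is that? Answer in one sentence, moving towards economic reasoning."
        )
        answer = ask(prompt)
        chain.append(answer)
        statement = answer
    return chain


def canned_model(answers: List[str]) -> Callable[[str], str]:
    """Offline stand-in for a language model: returns the given answers in order."""
    remaining = iter(answers)
    return lambda prompt: next(remaining)


chain = five_whys(
    "Fruits look perfect at the supermarket.",
    canned_model(["answer 1", "answer 2", "answer 3", "answer 4", "answer 5"]),
)
print(len(chain))
```

Because each prompt only contains the previous answer, the chain can drift. Passing the whole chain back on every call is an alternative worth testing.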

The multi-dimensional map of inequalities

During the biweekly discussions with the project partners, the question of why/when someone would use this often arises. Linking it with an early conversation with a behavioural media campaign publicist I met in Amsterdam in October, I explored the concept of “giving the user something to do, a task”.

The idea here is that each individual’s actions contribute to a greater result. As an analogy, we can use an advent calendar, where each day you unlock something that gets you closer to the goal.

Also, since the project is about showing multiple narratives/realities of the food system that are behind every product, we must find a way to show that variability.

Also, some partners raise concerns that the project is too biased towards demonising corporate practices (which it somehow is), thus a way to tell their story needs to be contemplated.

Mixing all this up, we can now frame the project as a collaborative multi-dimensional exploration of socio economical inequalities in the industrial food system.

A way to explore a complex system with many actors, and many issues.

To do so, I propose looking at the food system as a volume that can be mapped to different axes to show specific intersections of issues. This is by nature very vast, as it can contain as much granularity as desired. For prototyping purposes, three main axes are defined, and a fourth would be each food that is run through the tool.

On one axis we can have the agents, on another the issues, and even on a third the stages of the food system.

In this way, the project has a potential end-goal, which is to map and uncover all the possible scenarios.
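The “volume” idea can be made concrete by enumerating every combination of the axes and tracking which cells have already been explored. This is a sketch under assumptions: the agent and issue names are taken loosely from this page, while the `STAGES` entries are placeholders, not the project's actual selection.

```python
from itertools import product
from typing import Set, Tuple

AGENTS = [
    "wealthy consumer",
    "supermarket chain manager",
    "permaculture local grower",
    "CEA industrial grower",
]
ISSUES = [
    "power consolidation",
    "workers' rights",
    "food distribution",
    "economic reliance",
    "local and cultural foods",
]
# Placeholder stages, for illustration only.
STAGES = ["production", "retail", "consumption"]

Cell = Tuple[str, str, str, str]


def cells(food: str) -> Set[Cell]:
    """All (agent, issue, stage, food) combinations for one food."""
    return set(product(AGENTS, ISSUES, STAGES, [food]))


def coverage(explored: Set[Cell], food: str) -> float:
    """Fraction of the volume already mapped for this food."""
    total = cells(food)
    return len(explored & total) / len(total)


explored = {("permaculture local grower", "local and cultural foods", "retail", "lemon")}
print(len(cells("lemon")))            # 4 agents x 5 issues x 3 stages = 60
print(coverage(explored, "lemon"))
```

A coverage number like this is one possible way to express the end-goal of uncovering all scenarios, and it grows with every contribution.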

To test this approach, the following entries are selected:

This approach allows us to play with the concept of “volume of possibilities”.

Each “agent” needs to get their own “flavour”, their voice, because each agent looks at the issues from their own perspective.

For the final tool, the particularities of each “agent” and their subjectivity (“their story”) will be captured either with interviews and forms, via interface dials or with presets.

Each interaction ends with a call-to-action. Each agent could define a set of actions and the system would dance around them. ⚠️ A safeguarding setup needs to be implemented to prevent greenwashing/foodwashing, because a main goal of the project is to expose the socioeconomic issues in a friendly, engaging, unapocalyptic and blameless manner.

Below, a test run with a 🍋, from a Permaculture local grower perspective focusing on local food culture:

Some other tests with 🥑 and 🍅

Technical development

Since mid October I’ve been scouting for a cool developer to help with the implementation,
and I’m happy to have connected with Ruben Gres

In late December we hosted an intensive work session in the studio to draft the architecture of the digital prototype:

The prototype will take the form of a mobile-friendly accessible website. 

Initial development has been done to display a scrollable/draggable endless grid of images on the web.

The backend is running ComfyUI with Stable Diffusion, and a custom web interface collects the outputs.

A workflow to programmatically select and mask a given food has also been tested with promising results. 

We still need to solve the following technicalities:

  • integrate LLM generation to drive image creation

  • store and reference text and images

  • combine each generation into compelling videos

Next steps from jan-may 24

  • define the user experience

  • interaction with the “map” as a viewer

  • interaction with the “filter” as a generator user

  • define how to fine-tune each agent’s perspective to the prototype

    • interviews / dials / presets

    • develop and implement those decisions

  • design a workflow to create videos with messages

  • technical test the prototype in the wild (supermarket, home, printed media…)

  • stress-test the prototype for failures


re-conceptualizing

call-to-action -> links

The deep dive in technical development triggers some questions that allow for rethinking the conceptual framework.

For instance, the language model (GPT) has to be invoked several times to:

  • compose the issue (from a set of socioeconomic factors) from a perspective (depending on the stakeholder)

  • create a visual description of that issue: to drive the image generation and morphing

  • create a call-to-action to propose solutions to the user.
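The three invocations listed above can be sketched as a chain where each call builds on the previous output. This is a hypothetical outline: the prompt texts are invented for illustration, and `llm` is any callable that takes a prompt and returns text (here replaced by an offline stand-in).

```python
from typing import Callable, Dict


def narrate(food: str, agent: str, factor: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run the three language model calls in sequence for one food."""
    # 1. Compose the issue from a socioeconomic factor, seen from one perspective.
    issue = llm(f"As a {agent}, describe in two sentences how '{factor}' affects {food}.")
    # 2. Turn the issue into a visual description to drive image generation.
    visual = llm(f"Describe one image, without any text in it, that illustrates: {issue}")
    # 3. Propose something the user can do about it.
    action = llm(f"Write a one-line call to action for a consumer about: {issue}")
    return {"issue": issue, "visual": visual, "call_to_action": action}


def echo_model(prompt: str) -> str:
    """Offline stand-in for the language model, so the sketch runs without an API."""
    return f"[model output for: {prompt[:30]}...]"


result = narrate("avocado", "permaculture local grower", "power consolidation", echo_model)
print(sorted(result))
```

Each call is stateless in this sketch, which matches the behaviour described on this page: the model has no memory of earlier generations unless they are passed in explicitly.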

Using this approach we managed to obtain convincing calls-to-action such as:

  • Reduce food deserts: Increase access to fresh foods.

  • Participate in food swaps: Share surplus with neighbors.

  • Demand fair trade avocados: Support small farmers and market competition.

  • Preserve food diversity: Choose heirloom watermelon varieties.

  • Choose community gardens: Cultivate accessible produce with neighbors.

As a proof of concept, a site was set up to collect early generations of text+imgs. But when seeing all generations at once, some repeat and the calls to action seem very repetitive. This is because the language model has no memory of what it has said previously, and it often uses verbs such as “choose”, “avoid”, “fight”…

I had to re-think the role of the call-to-action and experimented with explicit links for the viewer to take action.

The aim is to get recommendations of resources, associations and media related to the presented narrative.

This works, but often the presented links are the same, with a high-level approach such as “watch the Food, Inc. documentary” or “join the Slow Food movement”.

To add granularity to the generation, I tried including the location in the query, thus creating site-specific recommendations.

In some cases it surfaces very interesting content, as EU policies on food sovereignty or very specific local associations 👍
This is great because it allows us to go from a moment in reality, straight to a very specific piece of knowledge/data/action.

Results seem good, but sometimes the links are invented. Since a language model is predictive, the most probable way for a link to start is with http: and it often ends with .com or a locale variant. Thus, the generated links look OK, but may not be real :(

Examples:

Zagreb, Croatia

Lisbon, Portugal

To mitigate this, a session with experts from Brno University of Technology was conducted, and a change in the prompt-engineering design will be implemented, where instead of asking for links related to a content, we might extract content from links. The approach involves RAG (Retrieval-Augmented Generation), and empowers the LLM with eyes on the internet.
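The “content from links” idea can be sketched as follows: start from a set of known sources, retrieve the most relevant ones for a query, and only let the model cite those. This is a toy illustration, not the planned implementation. Keyword overlap stands in for a real retrieval step, and the `example.org` URLs and texts are invented placeholders.

```python
import re
from typing import Dict, List

# Hypothetical curated sources: URL -> text extracted from that page.
SOURCES = {
    "https://example.org/lisbon-gardens": "Community gardens network in Lisbon sharing plots and seeds",
    "https://example.org/zagreb-market": "Farmers market cooperative in Zagreb selling local produce",
    "https://example.org/eu-policy": "European policy brief on food sovereignty",
}


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, sources: Dict[str, str], k: int = 2) -> List[str]:
    """Rank sources by word overlap with the query (stand-in for embedding search)."""
    q = tokens(query)
    ranked = sorted(sources, key=lambda url: len(q & tokens(sources[url])), reverse=True)
    return [url for url in ranked[:k] if q & tokens(sources[url])]


def build_prompt(query: str, sources: Dict[str, str], k: int = 2) -> str:
    """Put the retrieved content in the prompt so the model does not have to guess links."""
    context = "\n".join(f"[{url}] {sources[url]}" for url in retrieve(query, sources, k))
    return (
        f"Recommend resources for: {query}\n"
        "Use only the sources below and cite only their URLs.\n"
        f"{context}"
    )


def unknown_links(answer: str, sources: Dict[str, str]) -> List[str]:
    """Links in a generated answer that are not among the known sources."""
    found = [u.rstrip(".,)") for u in re.findall(r"https?://\S+", answer)]
    return [u for u in found if u not in sources]


print(retrieve("community gardens in Lisbon", SOURCES))
print(unknown_links("See https://example.org/lisbon-gardens and http://made-up.com.", SOURCES))
```

The final check is a cheap safeguard: any link in the answer that was not part of the retrieved material can be dropped before it reaches the viewer.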

multiple perspectives

As the whole project is to showcase the different realities of the food system existing behind every piece of food, we explore how to narrate those from different perspectives.

Initially we thought of having interviews or work sessions with different stakeholders, but later we considered exploring the existing knowledge within a language model to extract the averaged points of view.

Initial perspectives are:

  • As a wealthy consumer, my purchasing power can influence the food system towards ethical practices by boycotting brands that consolidate power unfairly, mistreat workers, or contribute to unequal food distribution. I support local and cultural foods through patronage and invest in initiatives promoting sustainable economic models in the food industry.

  • As a supermarket chain manager, I prioritize ethical sourcing and fair labor practices to address power consolidation and workers' rights. We're enhancing food distribution to improve accessibility, supporting local economies to reduce reliance, and promoting local foods to preserve cultural variations. Our sector's efforts include partnerships with small producers and community initiatives.

  • As a Permaculture local grower, I champion decentralized food production, enhancing workers' rights through community-based projects and equitable labor practices. We improve food accessibility by fostering direct-to-consumer distribution channels, reducing economic dependence on industrial agriculture. Our approach preserves local food varieties, countering the homogenization driven by power consolidation within the industry.

  • As a CEA industrial grower, we recognize the complexities surrounding power consolidation and strive for equitable industry participation. We prioritize workers' rights, ensuring safe, fair conditions. Our technology improves food distribution/accessibility, reducing economic dependencies on traditional agriculture while preserving local/cultural foods through diverse crop production. We're committed to sustainable practices and solving systemic issues collaboratively.

The generated perspectives could be contrasted/validated by stakeholders.


Digital prototype - dev

A clear workflow of tools and data is drafted. The prototype will be composed of two clear components:

  • The reality filter: a mobile-first interface to take a photo in context and experience a narrative of the food system through a morphing video

  • The perspective map: a navigational interface for sorting and grouping large amounts of media according to similarities

To experiment and develop the reality filter, I tinker with ComfyUI, which is a frontend visual-coding framework to integrate multiple workflows related to text-to-image, mainly with Stable Diffusion models.

The good side of this is that with this tool I can chain several image-generation processes with conditional guidance to create the necessary scenes to compose the video. And I can also pipe some interpolation processes at the end.

For this workflow, ControlNet is key to carry visual similarities from frame to frame. Initially it was proposed to use depth estimation, but tests proved it to be too strong or too weak in different cases, thus experimenting was needed to find another approach.

Using the TILE model gives great results: as it is a model normally used for super-resolution, it carries the visual qualities of the source image in the conditioning.

ControlNet: Adding Conditional Control to Text-to-Image Diffusion Models

Scenes

Creating new images from starting ones has the following challenge: if the gap is too big, the visual connection is lost, but if the gap is small, we have very little pixel-space to tell a story. So if I want to bring the viewer to a new reality, I need to do it smoothly, and I can achieve this by incrementally decreasing the strength of the initial image in each generation step.

With 4 steps I can already go quite far.
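The incremental decrease can be sketched as a simple schedule with one strength value per generation step. This is a minimal sketch with assumed numbers: the linear shape and the start/end values (0.8 and 0.2) are illustrative, not the values used in the actual workflow.

```python
from typing import List


def strength_schedule(steps: int = 4, start: float = 0.8, end: float = 0.2) -> List[float]:
    """Linearly decreasing weight of the initial image, one value per step.

    A high value keeps the scene close to the source photo;
    a low value leaves more room for the new narrative.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [start]
    return [round(start + (end - start) * i / (steps - 1), 3) for i in range(steps)]


print(strength_schedule())  # [0.8, 0.6, 0.4, 0.2]
```

A non-linear curve (for example, dropping slowly at first) may preserve the visual connection longer. That is something to compare visually rather than decide on paper.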

Scene interpolation / video / morph

To craft a smooth visual narrative from the real-world photo to the generated food-system scene, FILM (Frame Interpolation for Large Motion) will be used, even if it is quite computationally expensive. There are other approaches, but I feel they break the magic and the mesmerizing effect of seeing something slowly morphing into something else almost without noticing it.

24 in-between frames seems a good number to create a smooth transition from scene to scene
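As a quick sanity check on the video length, the frame budget can be worked out from the numbers above. This sketch assumes the source photo plus 4 generated scenes as keyframes, and a playback rate of 24 fps, which is an assumption and not a project decision.

```python
from typing import List


def morph_frames(scenes: int = 4, inbetween: int = 24) -> int:
    """Total frames: keyframes plus the interpolated frames between each pair."""
    keyframes = scenes + 1  # the source photo plus each generated scene
    return keyframes + (keyframes - 1) * inbetween


def blend_positions(inbetween: int = 24) -> List[float]:
    """Evenly spaced positions (0..1, exclusive) of the in-between frames."""
    return [i / (inbetween + 1) for i in range(1, inbetween + 1)]


frames = morph_frames()
print(frames)        # 5 keyframes + 4 * 24 in-betweens = 101
print(frames / 24)   # duration in seconds at an assumed 24 fps
```

At these settings a full morph lasts a little over four seconds, so the pacing can be adjusted by changing the frame rate without regenerating any frame.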

img2morph -> workflow

videos

samples of the resulting morphs

In the wild / testing

A key advancement has been to find a workflow that allows for experimentation and at the same time lets us use the setup in the real world.

To do this, we use ComfyUI as an API, and build a Gradio app to be able to use it from a web interface.

Once we have this, we are able to pipe that interface outside the server and use it via a public link, allowing for testing in different devices and locations.
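Queueing a workflow on a ComfyUI server can be sketched with the standard library alone. This sketch assumes ComfyUI's default local address and its `/prompt` endpoint. The one-node workflow fragment, the file name and the `client_id` are placeholders, not the project's actual graph.

```python
import json
import urllib.request

# Default local ComfyUI address; adjust to wherever the server actually runs.
COMFY_URL = "http://127.0.0.1:8188/prompt"


def build_request(workflow: dict, client_id: str = "food-dysmorphia") -> urllib.request.Request:
    """Wrap a workflow (exported in API format) into a POST request."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        COMFY_URL, data=payload, headers={"Content-Type": "application/json"}
    )


def queue_workflow(workflow: dict) -> dict:
    """Send the workflow to the server; only works while ComfyUI is running."""
    with urllib.request.urlopen(build_request(workflow)) as response:
        return json.loads(response.read())


# Placeholder fragment: a real workflow contains the full node graph.
workflow = {"1": {"class_type": "LoadImage", "inputs": {"image": "lemon.jpg"}}}
request = build_request(workflow)
print(request.get_method(), request.full_url)
```

A Gradio interface would then call something like `queue_workflow` from its handler. Launching the app with `share=True` is what produces a temporary public link.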

next (technical) steps

*The technical development is done in collaboration with Ruben Gres. From now on, when I say “we” I mean Ruben + Bernat


Interface prototype

Sorting

Learning about dimensional reduction, I feel this can be a nice touch to use as navigation for the project.

The aim is to find a way to sort the content (media + text) by affinity, or according to non-strict, indirect variables.

One approach is using t-SNE, t-distributed stochastic neighbor embedding (here a good interactive demo). This allows for sorting/grouping of entities according to their distance (embedding) to each other or to a reference (word, image…)

One example using this technique is this GoogleArts project mapping artworks by similarity

We have been able to plot vegetables according to different axes, using CLIP embeddings
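Sorting by distance to a reference can be illustrated with cosine similarity. This is a toy sketch: the three-dimensional vectors are made up and only stand in for real CLIP embeddings, which have hundreds of dimensions and come from the model itself.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of the same length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sort_by_affinity(items: Dict[str, Sequence[float]], reference: Sequence[float]) -> List[str]:
    """Item names ordered from most to least similar to the reference embedding."""
    return sorted(items, key=lambda name: cosine(items[name], reference), reverse=True)


# Made-up vectors standing in for CLIP embeddings of three foods.
embeddings = {
    "lemon": [0.9, 0.1, 0.0],
    "tomato": [0.2, 0.9, 0.1],
    "avocado": [0.5, 0.5, 0.2],
}
reference = [1.0, 0.0, 0.0]  # e.g. the embedding of a reference word or image

print(sort_by_affinity(embeddings, reference))  # ['lemon', 'avocado', 'tomato']
```

Running the same sort against two different references gives two coordinates per item, which is one simple way to plot content along chosen axes before trying a full t-SNE layout.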

grid / navigation / upload / video generation

For the exploration map, an endless canvas with a grid will be used.
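One way to make a finite set of images feel endless is to tile it, so any grid coordinate wraps back onto an existing item. This is a hypothetical sketch of that mapping only. The function name and the tile size are assumptions, and the actual prototype may fill the canvas differently.

```python
def item_at(col: int, row: int, width: int, height: int) -> int:
    """Index of the item shown at grid cell (col, row) on an endlessly tiled canvas.

    The items are laid out row by row in a `width` x `height` tile that repeats
    in every direction, including negative coordinates.
    """
    return (row % height) * width + (col % width)


# A 4 x 3 tile holds 12 items (indices 0..11).
print(item_at(0, 0, 4, 3))    # 0: top-left of the tile
print(item_at(5, 4, 4, 3))    # 5: wraps around in both directions
print(item_at(-1, 0, 4, 3))   # 3: dragging left of the origin still resolves
```

Python's modulo always returns a non-negative result for a positive divisor, which is why negative coordinates need no special case here.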

getting the right voice / perspective

Testing the whole text-generation pipeline

food -> agent -> issue -> context -> title -> links -> image representation




This project is developed as part of the Hungry EcoCities S+T+ARTS Residency which has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement 101069990. 