Understanding the socio-economic structure of a city is essential for effective public policy, urban planning, and social research. However, traditional approaches often rely on discrete categories or manually constructed indices that fail to capture the full complexity of reality.

At AI-Sunrise, we approached this challenge from a different angle.

Intelligent socio-economic map

A Hidden Structure in Census Data

We analyzed census data from Buenos Aires, where each household answers 79 different questions covering education, housing, employment, access to services, and more.

A key insight emerged early on:

These answers are not independent or random: they are strongly correlated with one another, which points to a shared hidden (latent) variable behind them.

Because the questions are predominantly socio-economic in nature, we hypothesized that this latent variable corresponds to socio-economic level.
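To make the intuition concrete, here is a minimal sketch on simulated data, not on the census itself. It assumes a toy model in which each yes/no answer becomes more likely as a single hidden level rises; the function names, the 8 questions, and the `latent_strength` parameter are all hypothetical. It compares the average pairwise correlation between answers with and without such a shared variable.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def mean_pairwise_correlation(columns):
    """Average Pearson correlation over all distinct pairs of columns."""
    pairs = [(i, j) for i in range(len(columns)) for j in range(i + 1, len(columns))]
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)

def simulate_answers(n_households, n_questions, latent_strength, rng):
    """Toy yes/no answers; 'yes' is more likely when the hidden level is high."""
    columns = [[] for _ in range(n_questions)]
    for _ in range(n_households):
        level = rng.gauss(0.0, 1.0)  # hypothetical latent socio-economic level
        for q in range(n_questions):
            p_yes = 1.0 / (1.0 + math.exp(-latent_strength * level))
            columns[q].append(1 if rng.random() < p_yes else 0)
    return columns

rng = random.Random(0)
with_latent = simulate_answers(2000, 8, latent_strength=2.0, rng=rng)
independent = simulate_answers(2000, 8, latent_strength=0.0, rng=rng)

corr_latent = mean_pairwise_correlation(with_latent)
corr_independent = mean_pairwise_correlation(independent)
print(f"shared latent variable: {corr_latent:.2f}")
print(f"independent answers:    {corr_independent:.2f}")
```

In this toy setting, answers that share a hidden level are clearly correlated, while answers generated independently are not. Correlation alone does not say what the hidden variable means; that interpretation comes from the content of the questions.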

Inferring a Latent Socio-Economic Variable

Rather than defining arbitrary rules or thresholds, we used unsupervised machine learning, specifically matrix factorization techniques, to infer this latent variable directly from the data.

This approach allowed us to:

The result is an intelligent socio-economic index, learned entirely from the structure of the data itself.
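As an illustration of the general idea, the sketch below fits a rank-one matrix factorization to simulated yes/no answers using alternating least squares in pure Python. It is one simple member of the matrix factorization family and is not necessarily the technique used in the project. The data, the 10 questions, and all names (`rank_one_factorization`, `true_level`, and so on) are assumptions made for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def standardize(columns):
    """Center each question column and scale it to unit variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        out.append([(v - mean) / std for v in col])
    return out

def rank_one_factorization(columns, n_iter=50):
    """Approximate X[i][q] by score[i] * loading[q] via alternating least squares."""
    n_q, n_h = len(columns), len(columns[0])
    loading = [1.0] * n_q
    score = [0.0] * n_h
    for _ in range(n_iter):
        # Least-squares household scores for fixed question loadings.
        denom = sum(l * l for l in loading)
        score = [sum(columns[q][i] * loading[q] for q in range(n_q)) / denom
                 for i in range(n_h)]
        # Least-squares question loadings for fixed household scores.
        denom = sum(s * s for s in score)
        loading = [sum(columns[q][i] * score[i] for i in range(n_h)) / denom
                   for q in range(n_q)]
    return score, loading

# Simulated data: a hidden level per household drives 10 yes/no answers.
rng = random.Random(1)
true_level = [rng.gauss(0.0, 1.0) for _ in range(1000)]
answers = [
    [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * z)) else 0 for z in true_level]
    for _ in range(10)
]

score, loading = rank_one_factorization(standardize(answers))
recovery = pearson(score, true_level)
print(f"correlation between inferred score and hidden level: {recovery:.2f}")
```

The factorization never sees `true_level`; it only sees the answers. The sign and scale of the inferred score are arbitrary in general, so a real index would need a convention for orientation, for example anchoring on a question whose direction is known.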

A Continuous Socio-Economic Heat Map

Using the inferred latent variable, we produced the first continuous socio-economic heat map for Buenos Aires, and later extended the method to other Argentine cities.
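One simple way to turn per-household index values into a map layer is to average them over grid cells. The sketch below assumes each record carries planar coordinates and an index value; the coordinates, the cell size, and the function name `grid_heat_map` are hypothetical, and the actual map may have been built differently (for example, per census unit).

```python
from collections import defaultdict

def grid_heat_map(points, cell_size):
    """Average the index over square grid cells.

    points: iterable of (x, y, index_value) tuples.
    Returns a dict mapping (column, row) to the mean index in that cell.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of values, count]
    for x, y, value in points:
        cell = (int(x // cell_size), int(y // cell_size))
        totals[cell][0] += value
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}

# Hypothetical households: (x, y, index_value), coordinates in arbitrary units.
households = [
    (0.2, 0.3, 1.5),
    (0.8, 0.1, 0.5),
    (1.4, 0.6, -1.0),
    (1.9, 0.9, -2.0),
]
heat = grid_heat_map(households, cell_size=1.0)
print(heat)
```

Because the index is a real number rather than a category, the cell averages vary smoothly with the underlying values, and each cell can be colored on a continuous scale.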

Unlike traditional indices:

You can explore the result here: Socio-Economic Map of Buenos Aires

Why This Matters

A continuous, unsupervised socio-economic index provides crucial leverage for many applications:

Because the index is inferred rather than imposed, it adapts naturally to the data and reveals patterns that would otherwise remain hidden.

From Data to Insight

This project exemplifies our approach at AI-Sunrise: using rigorous statistical modeling and machine learning to transform complex public data into actionable, high-impact insight.

When latent structure exists, we don’t guess — we infer it.