Macro Connections
Transforming data into knowledge.

The way we act, both individually and collectively, depends strongly on the way we see the world. The Macro Connections group focuses on the development of analytical tools that can help improve our understanding of the world's macro structures in all of their complexity. By developing methods to analyze and represent networks—such as the networks connecting countries to the products they export, or historical characters to their peers—Macro Connections research aims to help improve our understanding of the world by putting together the pieces that our scientific disciplines have helped to pull apart.

Research Projects

  • Collective Memory

    Cesar A. Hidalgo, C. Jara-Figueroa, and Amy Yu

    Collective memory is formed from the information that our species imbues in both humans and objects. We encode this information as order within physical systems such as media technologies, which allows us to preserve collective memory and transmit it to posterity. We use the biographies of 11,341 memorable people that comprise the Pantheon dataset to study how changes in the systems used to store information affect the quantity and composition of our species' collective memory. We find that changes in media technology such as the printing press, the industrial revolution, the telegraph, and television mark milestones in the evolution of the composition of our collective memory, a composition that is directly affected by the predominant communication technology of the time. We also find that these milestones mark changes in the quantity of information from each period that makes up our collective memory.

  • Data Visualization: The Pixel Factory

    Cesar A. Hidalgo and Macro Connections group

    The rise of computational methods has generated a new natural resource: data. While it's unclear if big data will open up trillion-dollar markets, it is clear that making sense of data isn't easy, and that data visualizations are essential to squeeze meaning out of data. But the capacity to create data visualizations is not widespread; to help develop it we introduce the Pixel Factory, a new initiative focusing on the creation of data-visualization resources and tools in collaboration with corporate members. Our goals are to create software resources for development of online data-visualization platforms that work with any type of data; and to create these resources as a means to learn. The most valuable outcome of this work will not be the software resources produced—incredible as these could be—but the generation of people with the capacity to make these resources.

  • DIVE

    Cesar A. Hidalgo and Kevin Zeng Hu

    The Data Integration and Visualization Engine (DIVE) is a platform for semi-automatically generating web-based, interactive visualizations of structured datasets. DIVE will allow users to quickly and efficiently create visualization engines like the Observatory of Economic Complexity, DataViva, and Pantheon. Three components lie at the core of DIVE: inferring the properties and models underlying arbitrary datasets, mapping these properties to visualizations, and programmatically creating scalable, customizable websites integrating these visualizations.

  • Economic Complexity and Income Inequality

    Cesar A. Hidalgo, Manuel Aristaran, Dominik Hartmann, Cristian Ignacio Jara Figueroa and Miguel Guevara

    The mix of products that a country exports is a known predictor of income and economic growth, but does this product mix also predict income inequality? Here we apply methods from statistics, network science, and economic complexity to a dataset combining more than 50 years of international trade data and income inequality. Our results document a robust and stable relationship between income inequality and the mix of products that a country exports. In addition, we present the PINI index: a measure that relates 773 different types of products to the levels of income inequality in their producer countries. Combining the PINI information with the network of related products allows us to illustrate how changes in a country's industrial structure are accompanied by changes in its level of income inequality.
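    As a rough illustration of how a product-level inequality measure could be built, the sketch below averages the Gini coefficients of the countries that export each product, weighting each country by the product's share in its export basket. The weighting scheme, the function name, and the toy numbers are assumptions made for this example; the description above does not give the formula behind the PINI index.

```python
from collections import defaultdict


def product_inequality_index(exports, gini):
    """Weighted average Gini of each product's exporters (illustrative only).

    exports: {country: {product: export value}}
    gini:    {country: Gini coefficient}
    The weight of a country for a product is that product's share of the
    country's total exports (an assumption, not the published method).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for country, basket in exports.items():
        total = sum(basket.values())
        if total == 0 or country not in gini:
            continue  # skip countries with no exports or no Gini value
        for product, value in basket.items():
            share = value / total
            weighted_sum[product] += share * gini[country]
            weight_total[product] += share
    return {
        product: weighted_sum[product] / weight_total[product]
        for product in weighted_sum
        if weight_total[product] > 0
    }


# Toy data, invented for the example.
toy_exports = {
    "A": {"crude_oil": 80.0, "machinery": 20.0},
    "B": {"crude_oil": 10.0, "machinery": 90.0},
}
toy_gini = {"A": 0.50, "B": 0.30}
toy_index = product_inequality_index(toy_exports, toy_gini)
```

    In this toy case the product concentrated in the more unequal country ("crude_oil") receives the higher index value.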

  • FOLD

    Alexis Hope, Kevin Hu, Joe Goldbeck, Nathalie Huynh, Matthew Carroll, Cesar A. Hidalgo, Ethan Zuckerman

    FOLD is an authoring and publishing platform for creating modular, multimedia stories. Some readers require greater context to understand complex stories. Using FOLD, authors can search for and add "context cards" to their stories. Context cards can contain videos, maps, tweets, music, interactive visualizations, and more. FOLD also allows authors to link stories together by remixing context cards created by other writers.


  • GIFGIF

    Cesar A. Hidalgo, Andrew Lippman, Kevin Zeng Hu and Travis Rich

    An animated GIF is a magical thing. It has the power to compactly convey emotion, empathy, and context in a subtle way that text or emoticons often miss. GIFGIF is a project to combine that magic with quantitative methods. Our goal is to create a tool that lets people explore the world of GIFs by the emotions they evoke, rather than by manually entered tags. A web site with 200,000 users maps the GIFs to an emotion space and lets you peruse them interactively.

  • Immersion

    Deepak Jagdish, Daniel Smilkov and Cesar Hidalgo

    Immersion is a visual data experiment that delivers a fresh perspective of your email inbox. Focusing on a people-centric approach rather than the content of the emails, Immersion brings into view an important personal insight—the network of people you are connected to via email, and how it evolves over the course of many years. Given that this experiment deals with data that is extremely private, it is worthwhile to note that when given secure access to your Gmail inbox (which you can revoke any time), Immersion only uses data from email headers and not a single word of any email's subject or body content.
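    The header-only idea can be sketched in a few lines. The example below reads the From, To, and Cc fields of each message and counts how often two people appear on the same email. It is not Immersion's code: the function name, the choice to link co-appearing addresses, and the sample address are assumptions made for illustration. Subject lines and bodies are never inspected.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses
from itertools import combinations


def header_network(raw_messages, owner):
    """Count co-appearances of people on the same email, using headers only.

    raw_messages: iterable of raw RFC 5322 message strings
    owner:        the inbox owner's address, excluded from the network
    Returns a Counter mapping (address_a, address_b) to a count.
    """
    edges = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Only the addressing headers are read; subject and body are ignored.
        fields = (
            msg.get_all("From", [])
            + msg.get_all("To", [])
            + msg.get_all("Cc", [])
        )
        people = {addr.lower() for _, addr in getaddresses(fields) if addr}
        people.discard(owner.lower())
        for a, b in combinations(sorted(people), 2):
            edges[(a, b)] += 1
    return edges
```

    Calling `header_network(messages, "me@example.com")` on a list of raw messages yields edge weights that could then be drawn as a network of contacts.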

  • Opus

    Cesar A. Hidalgo and Miguel Guevara

    Opus is an online tool exploring the work and trajectory of scholars. Through a suite of interactive visualizations, Opus helps users explore the academic impact of a scholar's publications, discover her network of collaborators, and identify her peers.

  • Pantheon

    Ali Almossawi, Andrew Mao, Defne Gurel, Cesar A. Hidalgo, Kevin Zeng Hu, Deepak Jagdish, Amy Yu, Shahar Ronen and Tiffany Lu

    We were not born with the ability to fly, cure disease, or communicate at long distances, but we were born in a society that endows us with these capacities. These capacities are the result of information that has been generated by humans and that humans have been able to embed in tangible and digital objects. This information is all around us: it's the way in which the atoms in an airplane are arranged or the way in which our cellphones whisper dance instructions to electromagnetic waves. Pantheon is a project celebrating the cultural information that endows our species with these fantastic capacities. To celebrate our global cultural heritage, we are compiling, analyzing, and visualizing datasets that can help us understand the process of global cultural development.

  • Place Pulse

    Phil Salesses, Anthony DeVincenzi, and César A. Hidalgo

    Place Pulse is a website that allows anybody to quickly run a crowdsourced study and interactively visualize the results. It works by taking a complex question, such as "Which place in Boston looks the safest?" and breaking it down into easier-to-answer binary pairs. Internet participants are given two images and asked "Which place looks safer?" From the responses, directed graphs are generated and can be mined, allowing the experimenter to identify interesting patterns in the data and form new hypotheses based on their observations. It works with any city or question and is highly scalable. With an increased understanding of human perception, it should be possible for calculated policy decisions to have a disproportionate impact on public opinion.
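    The step from pairwise answers to a directed graph can be sketched as follows. Each vote becomes an edge from the preferred image to the other one, and a simple win fraction summarizes how often an image was chosen. The function names and the win-fraction score are assumptions for this example; the description above does not say how Place Pulse mines its graphs.

```python
from collections import defaultdict


def build_preference_graph(votes):
    """Turn (winner, loser) votes into a weighted directed graph.

    Returns {winner: {loser: number of times winner was preferred}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for winner, loser in votes:
        graph[winner][loser] += 1
    return graph


def win_fraction(graph):
    """Fraction of comparisons each image won (a simple, assumed score)."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, losers in graph.items():
        for loser, count in losers.items():
            wins[winner] += count
            appearances[winner] += count
            appearances[loser] += count
    return {image: wins[image] / appearances[image] for image in appearances}


# Hypothetical votes: each tuple is (image judged safer, the other image).
toy_votes = [
    ("img_a", "img_b"),
    ("img_a", "img_c"),
    ("img_b", "img_c"),
    ("img_a", "img_b"),
]
toy_scores = win_fraction(build_preference_graph(toy_votes))
```

    A win fraction is only one way to read such a graph; ranking methods that account for the strength of the opponents would be a natural refinement.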

  • StreetScore

    Nikhil Naik, Jade Philipoom, Ramesh Raskar, Cesar Hidalgo

    StreetScore is a machine learning algorithm that predicts the perceived safety of a streetscape. StreetScore was trained using 2,920 images of streetscapes from New York and Boston and their rankings for perceived safety obtained from a crowdsourced survey. To predict an image's score, StreetScore decomposes this image into features and assigns the image a score based on the associations between features and scores learned from the training dataset. We use StreetScore to create a collection of map visualizations of perceived safety of street views from cities in the United States. StreetScore allows us to scale up the evaluation of streetscapes by several orders of magnitude when compared to a crowdsourced survey. StreetScore can empower research groups working on connecting urban perception with social and economic outcomes by providing high resolution data on urban perception.

  • The Economic Complexity Observatory

    Alex Simoes and César A. Hidalgo

    With more than six billion people and 15 billion products, the world economy is anything but simple. The Economic Complexity Observatory is an online tool that helps people explore this complexity by providing tools that can allow decision makers to understand the connections that exist between countries and the myriad of products they produce and/or export. The Economic Complexity Observatory puts at everyone's fingertips the latest analytical tools developed to visualize and quantify the productive structure of countries and their evolution.

  • The Language Group Network

    Shahar Ronen, Kevin Hu, Michael Xu, and César A. Hidalgo

    Most interactions between cultures require overcoming a language barrier, which is why multilingual speakers play an important role in facilitating such interactions. In addition, certain languages–not necessarily the most spoken ones–are more likely than others to serve as intermediary languages. We present the Language Group Network, a new approach for studying global networks using data generated by tens of millions of speakers from all over the world: a billion tweets, Wikipedia edits in all languages, and translations of two million printed books. Our network spans over eighty languages, and can be used to identify the most connected languages and the potential paths through which information diffuses from one culture to another. Applications include promotion of cultural interactions, prediction of trends, and marketing.

  • The Network Impact in Success

    Cesar A. Hidalgo and Miguel Guevara

    Diverse teams of authors are known to generate higher-impact research papers, as measured by their number of citations. But is this because cognitively diverse teams produce higher quality work, which is more likely to get cited and adopted? Or is it because they possess a larger number of social connections through which to distribute their findings? In this project we are mapping the co-authorship networks and the academic diversity of the authors in a large volume of scientific publications to test whether the adoption of papers is explained by cognitive diversity or the size of the network associated with each of these authors. This project will help us understand whether the higher levels of adoption of work generated by diverse groups are the result of higher quality or better connections.

  • The Privacy Bounds of Human Mobility

    Cesar A. Hidalgo and Yves-Alexandre DeMontjoye

    We used 15 months of data from 1.5 million people to show that four points—approximate places and times—are enough to identify 95 percent of individuals in a mobility database. Our work shows that human behavior puts fundamental natural constraints on the privacy of individuals, and these constraints hold even when the resolution of the dataset is low. These results demonstrate that even coarse datasets provide little anonymity. We further developed a formula to estimate the uniqueness of human mobility traces. These findings have important implications for the design of frameworks and institutions dedicated to protecting the privacy of individuals.
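    The notion of uniqueness can be illustrated with a small Monte Carlo estimate: draw a few points from one person's trace and check whether any other trace also contains all of them. This is a sketch under stated assumptions, not the formula mentioned above. The function name, the trace representation as sets of (place, hour) pairs, and the toy data are hypothetical.

```python
import random


def unicity(traces, num_points, trials=200, seed=0):
    """Estimate the fraction of draws in which num_points identify one user.

    traces: {user: set of (place, hour) points}
    A draw counts as unique when exactly one trace (the sampled user's own)
    contains all of the sampled points.
    """
    rng = random.Random(seed)
    eligible = [u for u, t in traces.items() if len(t) >= num_points]
    if not eligible:
        return 0.0
    unique = 0
    for _ in range(trials):
        user = rng.choice(eligible)
        # sorted() gives a sequence, which random.sample requires.
        sample = set(rng.sample(sorted(traces[user]), num_points))
        matches = sum(1 for trace in traces.values() if sample <= trace)
        if matches == 1:
            unique += 1
    return unique / trials


# Toy traces, invented for the example.
toy_traces = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("B", 9), ("D", 18)},
    "u3": {("A", 8), ("E", 12), ("F", 20)},
}
```

    On the toy data, a single point often matches several people, while three points always single out one trace, which mirrors the qualitative finding that a handful of points is enough to identify most individuals.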