MIT Solve

Solution overview

Our Solution

Hints.AI

Tagline

Finding misinformation online

Pitch us on your solution

Misinformation (fake news) has pernicious effects on our communities and our forms of government. It has been implicated in harm to civic life in a wide variety of countries. For example, vaccination rates have been reduced by fake news, leading to outbreaks of infectious diseases. While malign actors have used information to deepen fissures in society for many years, social media has increased our vulnerability.

Our solution uses network analysis to detect misinformation. Our system is robust, scalable, impervious to adversaries, and offers an explainable machine learning approach to misinformation detection.

In contrast to current approaches, which use manual labeling or natural language processing (NLP), our solution uses network analysis. Who interacts with an article is just as important as what the article states.

Journalists have expressed keen interest, and our partners in the media monitoring industry are eager to see us succeed.

What is the problem you are solving?

Misinformation is global in nature, scale, and impact. Everything from elections in modern democracies and trust in civic institutions to vaccination rates and willingness to adopt new technologies (e.g., 5G) is influenced by misinformation. Most importantly, the underlying social cohesion of modern societies is adversely impacted by adversaries attempting to exploit societal fissures for their own ends.

By stopping the spread of fake news in social networks, we can greatly reduce the impact of misinformation on our societies.

Who are you serving?

We are currently working with journalists at newspapers of record as well as with service providers to journalists. We are also working with companies that monitor social media, and we have started conversations with the social media companies themselves.

What is your solution?

Current methods to detect misinformation rely mainly on source analysis, which is easily spoofed, or on NLP/semantic analysis, which is not sufficiently advanced for the problem (e.g., "Pope endorses Trump" is perfectly valid semantically).

In contrast, we rely on network analysis. Constructing a network of articles, with weights representing the interactions between them, allows us to discover surprising relationships between disparate types of misinformation.

Formally, we define a graph of articles. Each article is linked to other articles by weighted edges, where the weight depends on the number, type, and percentage of users the two articles have in common. The more two articles share a specialized audience, the more related they are. To bootstrap the protocol, some articles are labeled as True/False using data from the International Fact Checking Network (IFCN) as well as newspapers of record (e.g., NYTimes, Washpost).
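
The graph definition above can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the patent-pending method: the function name build_article_graph, the min_shared parameter, and the use of Jaccard overlap (shared users divided by all users of either article) as the edge weight are all hypothetical choices made for the example, and the interaction data is a toy sample.

```python
from itertools import combinations


def build_article_graph(interactions, min_shared=1):
    """Build a weighted article graph from user interactions.

    interactions: dict mapping article id -> set of user ids.
    Returns a dict {(a, b): weight} with a < b. The weight is the
    Jaccard overlap of the two audiences (an assumed weighting).
    """
    edges = {}
    for a, b in combinations(sorted(interactions), 2):
        shared = interactions[a] & interactions[b]
        if len(shared) < min_shared:
            continue  # no common audience, so no edge
        union = interactions[a] | interactions[b]
        edges[(a, b)] = len(shared) / len(union)
    return edges


# Toy data: article_1 and article_2 share two of four distinct users.
interactions = {
    "article_1": {"u1", "u2", "u3"},
    "article_2": {"u2", "u3", "u4"},
    "article_3": {"u9"},
}
edges = build_article_graph(interactions)
print(edges)
```

In this toy sample only article_1 and article_2 are linked, because article_3 shares no users with either of them.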

For example, starting with labeled data that contains misinformation about #Khasoggi leads us to misinformation about Beto O'Rourke illegally funding a migrant caravan. Starting with misinformation about vaccinations leads us to misinformation about crack cocaine (that it was created by the CIA), 9/11, QAnon, etc.

Note that the labeling itself is highly robust. Even a relatively high percentage of mislabels will still result in correct sorting due to the usage of network algorithms to increase our redundancy and power.
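
To illustrate why a network approach can tolerate some mislabeled seeds, here is a minimal sketch using a generic label propagation scheme. It is a stand-in chosen for the example, not the algorithm used in our system: the function propagate_labels, the alpha blending parameter, the +1/-1 score convention (true/false), and the toy graph are all assumptions.

```python
from itertools import combinations


def propagate_labels(edges, seeds, alpha=0.5, iterations=50):
    """Generic label propagation over a weighted graph (illustrative).

    edges: {(a, b): positive weight}; seeds: {article: +1.0 or -1.0}.
    Each round, a node's score blends its own seed label (0.0 if it
    has none) with the weighted average score of its neighbors.
    """
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, []).append((b, w))
        neighbors.setdefault(b, []).append((a, w))
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(w for _, w in nbrs)
            avg = sum(w * scores[n] for n, w in nbrs) / total
            updated[node] = alpha * seeds.get(node, 0.0) + (1 - alpha) * avg
        scores = updated
    return scores


# Toy cluster: five tightly linked articles. Three seeds are labeled
# false (-1), one seed is mislabeled as true (+1), and x is unlabeled.
nodes = ["f1", "f2", "f3", "f4", "x"]
edges = {pair: 1.0 for pair in combinations(nodes, 2)}
seeds = {"f1": -1.0, "f2": -1.0, "f3": -1.0, "f4": 1.0}
scores = propagate_labels(edges, seeds)
print(scores["x"])
```

In this toy cluster the unlabeled article x still ends up with a negative score, because the three correctly labeled neighbors outweigh the single mislabeled one.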

The system is robust and extremely quick. It is adversary-proof (even if an adversary knows how the algorithm works, the size of the manipulation is bounded by the amount of work they do) and explainable.

Select only the most relevant.

Make government and other institutions more accountable, transparent, and responsive to citizen feedback
Ensure all citizens can overcome barriers to civic participation and inclusion

Where is your solution team headquartered?

Boston, MA

Our solution's stage of development:

Prototype

More about your solution

Select one of the below:

New technology

Describe what makes your solution innovative.

It's a new technology that is fast, robust, and adversary-proof. The technology has tunable precision and recall, which can be made arbitrarily high at the cost of more data collection and a longer wait for that data. As the number of users interacting with an article increases, the algorithm becomes more accurate. Thus, by waiting until more users have interacted with an article, we can obtain a more accurate result. Note that the number of users needed is still small, and any article with too few users does not have enough traction to cause societal damage.
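
The precision/recall trade-off can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the function precision_recall, the convention that lower scores mean "more likely misinformation", the threshold values, and the toy scored articles are hypothetical and are not results from our system.

```python
def precision_recall(items, threshold):
    """Compute precision and recall when flagging scores <= threshold.

    items: list of (score, is_misinformation) pairs.
    Returns (precision, recall); precision is None if nothing is flagged.
    """
    tp = sum(1 for s, bad in items if s <= threshold and bad)
    fp = sum(1 for s, bad in items if s <= threshold and not bad)
    fn = sum(1 for s, bad in items if s > threshold and bad)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Toy scored articles (illustrative values only).
items = [
    (-0.9, True), (-0.7, True), (-0.4, True),
    (-0.3, False), (0.2, False), (0.6, False),
]

strict = precision_recall(items, threshold=-0.5)  # flags fewer articles
loose = precision_recall(items, threshold=-0.2)   # flags more articles
print(strict, loose)
```

On this toy data the strict threshold flags only true misinformation but misses one article, while the loose threshold catches all misinformation at the cost of one false positive.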

It also solves a known high impact societal problem.

Describe the core technology that your solution utilizes.

We utilize novel algorithms inspired by search algorithms. Our technology is patent pending.

One of the early methods of finding data on the web was the Yahoo directory, a manually annotated list (tree) of webpages. This is where we are today with misinformation: we try to manually annotate all of the fake news. This doesn't scale.

The next generation tried to understand the webpage itself using techniques such as NLP, semantic analysis, and TF-IDF. The technology never worked properly: "Pope endorses Trump" is perfectly valid semantically, and it takes deep domain knowledge to know that it is false.

The current generation uses network analysis (e.g., PageRank) and actually solves the problem.

It is worth noting that our solution is more robust than PageRank. To fool PageRank, an adversary wants to increase a score, whereas to fool HINTS an adversary wants to decrease a score. This is much harder.

Please select the technologies currently used in your solution:

Artificial Intelligence
Machine Learning
Big Data
Social Networks

Why do you expect your solution to address the problem?

Since we can discover fake news within minutes of it being uploaded, we can provide information to the social media companies to limit its spread and exposure. This limits the impact of misinformation and the attendant known social harms.

Our solution is already working as a prototype, and we are scaling up the engineering to ingest the entire decahose (provided to us by a partner organization).

In which countries do you currently operate?

United States

In which countries will you be operating within the next year?

United States

How many people are you currently serving with your solution? How many will you be serving in one year? How about in five years?

Currently we are in testing. In one year, we intend to be serving 500 journalists, and in five years, to be integrated into all the major social network platforms.

What are your goals within the next year and within the next five years?

Scaling up the engineering to enable full coverage of all the social media data provided to us by our partners.

What are the barriers that currently exist for you to accomplish your goals for the next year and for the next five years?

We have obtained partnerships that provide us with the crucial data. We are currently working on scaling up the engineering to an MVP for journalists, who have expressed keen interest.

This MVP will provide a feed of trending misinformation for journalists to debunk. It will also give us the visibility we need to integrate with the social media companies (who are risk averse) to prevent the spread of misinformation.

How are you planning to overcome these barriers?

Our path has been shockingly smooth, with everyone trying to help. We are utilizing cloud computing to enable scale-up, and our team includes seasoned experts in distributed systems.

If your solution has a website, provide a link here:

http://www.hints.ai

About your team

Your business model & funding

Partnership potential

Why are you applying to Solve?

Exposure. Social cachet especially with the large companies. Funding. Connections.

What types of connections and partnerships would be most catalytic for your solution?

Distribution
Talent or board members
Monitoring and evaluation
Media and speaking opportunities

With what organizations would you like to partner, and how would you like to partner with them?

Facebook, Twitter, Google :)

If you would like to apply for the AI Innovations Prize, describe how you and your team will utilize the prize to advance your solution. If you are not already using AI in your solution, explain why it is necessary for your solution to be successful and how you plan to incorporate it.

Our solution is based entirely on novel AI techniques for machine learning on social network data. We are experts in ML, distributed systems, and algorithms.

Funding will go towards hiring more data scientists to accelerate our growth.

If you would like to apply for the Innospark Ventures Prize, describe how you and your team will utilize the prize to advance your solution. If your solution utilizes data, describe how you will ensure that the data is sourced, maintained, and used ethically and responsibly.

Our solution is based on novel network analysis algorithms for social graph data. Funding will be used for engineering and scale-up.

Our current data is the Twitter decahose, accessed through our partner Zignal Labs. This data is publicly available (for a cost), and thus the data sourcing and maintenance issues are de minimis.

If you would like to apply for the Morgridge Family Foundation Community-Driven Innovation Prize, describe how you and your team will utilize the prize to advance your solution.

Our solution has the potential to drastically transform the misinformation ecosystem by preventing misinformation from spreading. This has wide application to a range of issues such as race relations, communal tensions, and anti-vaccine sentiment.