How Riddance does research
Please steal our methods.
Riddance is a small team of creators, writers, and analysts who specialize in media literacy and AI detection. We’re all confused by what feels like an insurmountable flood of AI-generated media. So, rather than calling ourselves “experts” and hiding behind an opaque process, we want to equip others with the tools and processes we use.
Even when making critical, expert-level judgments or uncovering large and imposing AI-generated networks, we want readers to leave with tools they can use in their own lives, built from many of the same methods that we use. This isn’t the primary goal of most news outlets, and it’s pretty hard to do.
This article will explain exactly what we do and how we do it. We’ll focus primarily on videos. It’s tedious to say “video, photo, audio, and text” all the time. But similar techniques may exist in photos, audio, and text, even if we don’t call them out specifically.
Media Analysis Methods
We’re proud of our in-depth knowledge of video and audio technology. It’s an edge we have from years of practice. Regular people (that’s not an insult - be glad you can watch a movie in peace) can learn the basics, then go more in-depth when they want to.
From understanding how real cameras and microphones work, to keeping up with the state-of-the-art in editing techniques and AI generation, we know what to look and listen for to distinguish real from fake media. And yet, media never exists in a vacuum. Its chain of custody will tell you where it came from, why someone shared it, and what parts of the process humans were directly responsible for. That’s why the video and the context are considered simultaneously.
Video analysis
During video analysis, we search for evidence of real capture, editing, AI generation, or some combination of all three.
Real videos originate from a physical space, captured through a lens and sensor, then recorded onto film or a digital format. Film, though less common today, is special because it shares a direct, physical relationship with the light that exposed it. Digital cameras approximate that physical interaction by recording videos into bits through sensors, and they can do it incredibly well. Real sounds originate from a physical space, too, captured with microphones onto recording devices. These physical processes and conversions leave “fingerprints.” People have always tried to get rid of these fingerprints because they can be distracting, but that’s almost impossible to do.
In real videos, some examples of “fingerprints” are lens flares, sensor noise, focus pulling, and sensor stabilization. Less obvious are the parallax and light interactions that exist in the physical space, and how they relate to a camera’s physical vantage point. This was taken for granted for a long time, but AI-generated videos don’t fundamentally have these characteristics; they just “learn” what they look like.
Modern AI video models use algorithms to create pixels. These pixels aren’t formed by a complex relationship between physical capture, encoding, and editing, but by a full-stack synthesis of the video itself. This is typically done with a process called diffusion. When AI-generated videos create strange perspectives, breaks in physics, or just obviously mess up what real life looks like, these aren’t just “mistakes”. They’re “fingerprints” of the algorithmic process.
Editing analysis
Videos have been edited for as long as they have existed. Editing isn’t usually deceptive; it’s essential in most video formats. Whether it’s a simple cut between scenes, a lightweight text caption added on top, a filter applied to smooth out the skin, or complex graphics compositing, we separate out original media from the editing techniques.
There are infinite combinations of techniques that make this much more complex than just saying “That’s real! Case Closed!” Animated films and synthesized instruments have been an intended “gotcha” point of many “this is real” debates in the age of AI-generated media, but we track human touch and transparency more than we care about the “authenticity” of specific mediums.
So, by analyzing the source material and editing techniques, we can distinguish between an AI-generated influencer and a real person using a touch-up filter. An AI influencer is objectively not real because they have no physical presence. The real person, captured by a lens and camera, then edited with a skin filter, is still a real person. Are they being their fullest, realest self? That’s not important to our judgment of whether they’re real, though it may be a useful data point in other ways.
Metadata & AI detectors
We’re usually viewing media through social media platforms, which rarely make metadata available. Metadata is hidden data in a file. It can contain very useful information, often down to the time of day and exact location where the video was shot. We jump at opportunities to get original files. Otherwise, we can download videos directly from the platforms to get more generic metadata like a video’s aspect ratio, frame rate, video codec, and more.
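When we pull a video down with a tool like yt-dlp, that generic metadata arrives as JSON. Here’s a minimal sketch of reading the useful fields out of it; the sample record is invented, but the field names (width, height, fps, vcodec) match what yt-dlp’s --dump-json actually emits:

```python
import json
from math import gcd

# Hypothetical sample of the JSON that `yt-dlp --dump-json <url>` emits;
# the field names match yt-dlp's info dict, the values are made up.
raw = '{"width": 1080, "height": 1920, "fps": 30, "vcodec": "h264", "upload_date": "20240115"}'
info = json.loads(raw)

# Reduce width/height to a simplified aspect ratio (e.g. 9:16 for vertical video).
g = gcd(info["width"], info["height"])
aspect = f'{info["width"] // g}:{info["height"] // g}'

print(aspect, info["fps"], info["vcodec"])  # → 9:16 30 h264
```

None of these fields prove anything on their own, but an odd frame rate or codec for the claimed source device is a useful data point.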
We run suspected AI photos through Google’s SynthID using Google Gemini. Otherwise, we rarely use AI detectors for more than a small data point. We don’t recommend AI detectors for most social media use cases, because there are so many edits and processes that are common in social media, and many AI detectors can’t reliably detect those. If we’re feeling confident about a specific AI method used and are familiar with an AI detector that is proven to detect that method, we’re more likely to use it. If we suspect that the AI method used is state-of-the-art, we won’t use AI detectors at all.
Source identification
When determining authenticity, finding the source is just as important as analyzing the video itself. We look at what the source’s patterns and techniques are. This necessarily means that we authenticate many other videos, photos, and even text put out by the source, to get the big picture about what they do and why. We look for signs of where humans or machines are involved in the process.
Page analysis
Always take what the social media platforms give you! For example, on the front end of Instagram, you can view an account’s age, number of previous handles, and account location (if the user allows it). Most AI-focused accounts were created within the past year, or have a previous handle as they re-branded for their new AI brand. On the other hand, a real creator who has been posting since 2012 is pretty safe. Of course, real creators experiment with AI generation, and real people are starting social media pages every day, too.
Platforms have different transparency levels. Our full guide is elsewhere on Riddance, but in short:
Facebook, Instagram, and YouTube have easy places to find account age and location.
TikTok doesn’t publicly show account age and location, but APIs make this information available.
X is always changing.
It’s often as simple as scrolling back to their first public post, looking for location tags, and doing some basic browsing.
Linguistic and behavioral analysis
A huge amount of information is stored in the text descriptions and copy. Do the language, idioms, and vocabulary align with their claimed identity? Do strings of text show up on other automated channels, or are they unique? Does the text show signs of AI-generation? These help determine what the entire content workflow looks like, as automated channels usually try to automate as much as possible.
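As a simplified illustration of the reused-strings check: captions can be normalized and grouped so that verbatim reuse across accounts stands out. The accounts and captions below are invented for the example.

```python
from collections import defaultdict

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't hide reuse."""
    return " ".join(text.lower().split())

# Hypothetical captions scraped from several accounts under investigation.
posts = [
    ("account_a", "Follow for daily motivation! #grindset"),
    ("account_b", "follow for daily  motivation!  #grindset"),
    ("account_c", "My dog learned a new trick today"),
]

seen = defaultdict(list)
for account, caption in posts:
    seen[normalize(caption)].append(account)

# Captions shared verbatim across accounts hint at an automated workflow.
reused = {cap: accts for cap, accts in seen.items() if len(accts) > 1}
print(reused)  # → {'follow for daily motivation! #grindset': ['account_a', 'account_b']}
```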
Some of us were doing source analysis with just Tweets, long before AI videos were a concern. Relative to looking for bot farms on Twitter, videos and photos offer so much more information to work with. How often are they posting? Do their posts align with other similar posts from automated accounts? Does their posting time align with their expected time zone? There are so many other places to dig in.
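As a sketch of that time-zone check: posting timestamps (invented here for the example) can be shifted into an account’s claimed time zone to see whether the activity pattern is humanly plausible.

```python
from collections import Counter
from datetime import datetime, timezone, timedelta

# Hypothetical UTC post timestamps from one account claiming to be US-based.
timestamps = [
    "2024-05-01T03:15:00", "2024-05-01T03:45:00",
    "2024-05-02T04:02:00", "2024-05-03T03:30:00",
]

# Shift into the account's claimed time zone (US Eastern, UTC-4 in May).
claimed_tz = timezone(timedelta(hours=-4))
local_hours = [
    datetime.fromisoformat(t).replace(tzinfo=timezone.utc).astimezone(claimed_tz).hour
    for t in timestamps
]

# A cluster of posts around 11 PM local time is plausible for a human; activity
# centered on another region's daytime, or rigid same-minute daily bursts, is not.
print(Counter(local_hours))  # → Counter({23: 3, 0: 1})
```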
Website information
If they have a website, we’ll do a WHOIS lookup to see who registered the domain, when it was registered, and what other information we can gather there. A website itself holds a ton of information. A simple “link in bio” can lead you to other pages that a creator forgot to maintain.
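WHOIS responses are plain text made of key: value lines, so pulling out the registration date takes only a little parsing. The record below is invented, but the “Creation Date” and “Registrar” keys follow the common registrar response format (exact keys vary somewhat by TLD):

```python
import re

# Hypothetical excerpt of what a `whois example-influencer.com` lookup returns.
whois_text = """\
Domain Name: EXAMPLE-INFLUENCER.COM
Registrar: Example Registrar, LLC
Creation Date: 2024-02-10T08:00:00Z
Registrant Organization: REDACTED FOR PRIVACY
"""

# Split each "Key: value" line into a dict; keys can't contain colons,
# so the first colon on each line is the separator.
fields = dict(re.findall(r"^([^:]+):\s*(.+)$", whois_text, flags=re.M))
print(fields["Creation Date"])  # → 2024-02-10T08:00:00Z
```

A domain registered last month behind a privacy service isn’t proof of anything, but next to a “posting since 2012” claim it’s worth noting.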
AI video benchmarking
Keeping an eye on what’s happening with AI by following others is not usually enough. We run tests frequently.
If there’s a video in question that we think could be AI-generated, we’ll try to replicate it. We’ll approximate the model(s) that could have been used and try to get a similar-looking video. Since AI models are random to varying degrees, you’ll never get an exact match. But seeing similar types of artifacts, subjects, and styles is really helpful.
We regularly benchmark new models as they’re released. We use API services like fal.ai to access a ton of models at once and only pay for what we need. We’re working on ways to give more people access to this testing. Stay tuned.
Trend Analysis Techniques
We’re often collecting data and analyzing trends to judge the impact of AI media. This means browsing social media platforms to collect information. How we collect that information is important.
Everything is manual
While it would be really efficient and make our jobs way easier, we avoid automation tools for browsing social media accounts. There are three reasons for this:
We want human judgment at every step. We’re assessing automation and AI technologies that can’t be reliably detected by machines.
Automation changes algorithms. For example, TikTok’s recommendation algorithm treats you differently if you scroll through a video exactly at its conclusion, versus scrolling when you just lose interest. In rare cases, we need machine-like precision (for example, if we don’t want view duration to sway recommendations), but we’ll still try to find analog methods rather than using a bot.
Automation tools are against the terms of service for most social media platforms. Humans need to click the buttons.
Riddance investigation tools
We have some proprietary software. The first is a custom browser extension that tracks sessions and video metrics, and ties video data to account-level data. It helps us store data in a way that is persistent, auditable, and shareable. We store videos from different platforms in a database, tying together accounts owned by the same people and keeping a “navigational graph” of each browsing session.
One idea that we think is important is information symmetry. Having a clear picture of the kinds of information a platform gleans from you is a powerful mechanism by which users can take back their autonomy and agency from platforms. The browser extension helps us mirror what the platforms are collecting about us while browsing.
The navigational graph that the session recorder tool spins out is designed to let our analysts retrace their footsteps throughout an investigation. If we find a video title on YouTube and search for that same title on TikTok and it surfaces another video of interest, we can link these two videos. Sometimes you’ll see this graph in our reports, because it’s a great visualization for networks of videos. We preserve this graphical structure in a database to extract for analysis later and to collaborate on investigations.
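Here’s a simplified illustration of the navigational graph idea (the class, schema, and example entries are invented for this sketch, not our production code): nodes are videos or accounts, and each edge records how the analyst got from one to the other.

```python
from collections import defaultdict

class NavGraph:
    """Toy navigational graph: nodes are videos/accounts, edges record hops."""

    def __init__(self):
        self.edges = defaultdict(list)

    def link(self, src: str, dst: str, how: str) -> None:
        """Record that we reached `dst` from `src` via the action `how`."""
        self.edges[src].append((dst, how))

    def retrace(self, start: str) -> list:
        """Walk outward from `start`, reproducing the analyst's path."""
        path, stack, seen = [], [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            path.append(node)
            stack.extend(dst for dst, _ in self.edges[node])
        return path

g = NavGraph()
g.link("yt:video123", "tiktok:video456", how="title search")
g.link("tiktok:video456", "tiktok:account_x", how="uploader page")
print(g.retrace("yt:video123"))
# → ['yt:video123', 'tiktok:video456', 'tiktok:account_x']
```

Storing the “how” on every edge is what makes an investigation auditable: another analyst can replay the exact sequence of searches and clicks.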
Recording and saving media
In addition to saving logs with the browser extension, we may screen record sessions with software or hardware recorders. This helps us preserve every bit of information possible. We save video URLs for investigations, then download video files for editing and archival. This is usually done through a command-line interface like yt-dlp, which also allows us to route downloads through proxy services. This is also how we ingest most of the videos needed for video edits. We’re really sensitive about how much media we save and we make sure to credit the media we use.
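For example, a typical archival download routed through a proxy looks something like this. The proxy URL, credentials, and video URL are placeholders; --proxy and --write-info-json are real yt-dlp flags.

```shell
# Download a video for archival, routed through a proxy, and save its
# metadata as a sidecar JSON file alongside the video.
yt-dlp --proxy "http://user:pass@proxy.example.com:8080" \
       --write-info-json \
       "https://www.tiktok.com/@someuser/video/1234567890"
```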
Creating fresh accounts for browsing
Creating fresh social media accounts is essential. They allow us to browse platforms free of biases from our existing social media accounts and control different variables. For example, creating a new YouTube account from the same IP address as your other accounts may influence the initial YouTube recommendations. This type of tracking goes much deeper than people expect.
There are a few goals we may have when creating new accounts:
Find similar media through targeted searches
“Does this exist somewhere else?”
Whether it’s finding reposts of a single video, finding multiple accounts associated with one actor, or understanding how groups of uncoordinated actors are riding an “algorithmic wave,” we use techniques like quote searching a string or following hashtags. Sometimes it’s overkill to do this on a fresh account, but existing account data can sway your results.
Find similar media through recommendation algorithms
“How much more of this exists?”
What are people seeing after they engage with a video or photo, but don’t seek it out explicitly? We want to see what recommendation algorithms will show them. We do this by “seeding” a fresh account in an isolated environment. After creating the account, we engage with a specific type of content, hoping an algorithm picks up on what we’re engaging with and “reacts” to it.
Understand what others are seeing
What are people seeing in general? For example, on a fresh account with no follows, what are the first 50 posts you get? How does that change when you provide a certain collection of expressed interests and follow specific accounts?
Investigations are full of these types of searches, which also yield a lot of AI-generated media to teach with.
Creating the accounts
Web traffic
We send web traffic through residential proxies, which is a fancy way of saying “we make it look like we’re using the internet at someone else’s house.” There are tons of proxy services out there, and we don’t recommend one over another, but having one that hasn’t yet been overused by scraping services will make everything go smoothly later. Occasionally we’ll pick proxy locations outside of the US, and we’ll cycle through different locations in general.
While using a popular VPN might be tempting, social platforms recognize most of their addresses and either block them or make them a big pain to use. So, while using a VPN and a proxy together provides a good backup in case the proxy fails, the proxy alone is sufficient. We usually route proxy traffic through a Chrome extension on desktop, and route all of the phone’s traffic when using a phone.
Unfortunately, being more specific presents some security and privacy issues. We also don’t want to give cybersecurity recommendations - that’s not our field.
Devices and accounts
We keep burner phone numbers, create new email accounts, and use new names and account information every time. While using individual Google Chrome profiles per session and tying proxy sessions to each profile is usually enough, occasionally we’ll use entirely new devices or virtual machines. Our methods change with the situation because we’re never doing anything illegal; we just want fresh accounts with no ties between them.
Everything evolves
Our methods are always changing with the content, platforms, and tools available. We hope this article gave you a good idea of where to start doing your own investigating. Check back for updates occasionally. We’ll create a change log at the bottom when appropriate.