Google released an AI tool to help interpret genomic data


This article is produced by NetEase Smart Studio (public account: smartman163).

[NetEase Smart News, December 10] If you've ever watched a crime show, you might picture a scene where forensic experts use computers to match DNA fragments from a crime scene with suspects. Real-life procedures aren't as dramatic as on TV, but the core idea is similar. Genetics is fundamentally a science of comparison: whether identifying a suspect, diagnosing a genetic disorder, or finding a long-lost relative, scientists compare genomes to detect similarities and differences among billions of DNA bases.

Although identifying missing persons or suspects usually involves only a few genetic markers, analyzing gene mutations in diseases often requires processing vast amounts of data. Despite ongoing research efforts to help scientists handle this, fully interpreting all the data remains a huge challenge. This is exactly where artificial intelligence comes into play.

This week, Google introduced DeepVariant, a program that uses deep learning to reconstruct a person’s genome and more accurately identify DNA sequence mutations. The technology was originally developed for image recognition tasks, such as distinguishing between cats and dogs. Now, it's being applied to a critical problem in DNA analysis.

Modern DNA sequencers generate high-throughput reads, but these are typically short, overlapping fragments rather than complete sequences. The fragments are aligned with a reference genome, and positions where they differ from the reference are candidate mutations. However, sequencing itself is prone to errors, so telling a real small mutation apart from a misread base can be extremely challenging. These tiny variations are crucial because they can, for example, reveal the root causes of diseases. The task of deciding which differences are true variants and which are errors is known as “variant calling.”
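To make the idea of variant calling concrete, here is a minimal sketch in Python. It is not how DeepVariant or GATK work: it assumes the reads are already aligned without gaps, and it simply counts the bases seen at each position, reporting a variant when a non-reference base dominates. The function name `call_variants`, the thresholds `min_depth` and `min_fraction`, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def call_variants(reference, reads, min_depth=3, min_fraction=0.6):
    """Naive pileup caller (illustrative assumption, not a real tool's method).

    reads is a list of (start, sequence) pairs, already aligned to the
    reference with no insertions or deletions.
    """
    # Count which bases were observed at each reference position.
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything
        base, count = counts.most_common(1)[0]
        # Report a variant only if a non-reference base clearly dominates.
        if base != reference[pos] and count / depth >= min_fraction:
            variants.append((pos, reference[pos], base))
    return variants


# Toy data: most reads carry T instead of A at position 4 (a real variant),
# while a single read shows G at position 9 (a likely sequencing error).
REFERENCE = "ACGTACGTAC"
READS = [
    (0, "ACGTAC"),
    (1, "CGTTCG"),
    (2, "GTTCGT"),
    (3, "TTCGTA"),
    (4, "TCGTAC"),
    (5, "CGTAC"),
    (5, "CGTAG"),
]

if __name__ == "__main__":
    print(call_variants(REFERENCE, READS))
```

In this toy example the change at position 4 is reported, while the lone mismatch at position 9 is outvoted by the reads that agree with the reference. Real data is far noisier, which is why fixed thresholds like these break down and more sophisticated statistical or learned models are used.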

There are existing tools that assist with this task, such as GATK, a toolkit whose methods are designed to account for common sequencing errors. DeepVariant takes a different approach by using neural networks, which results in more accurate mutation detection. Last year, DeepVariant won first place in an FDA-sponsored competition.

Neural networks loosely mimic the way neurons work in the brain, with each layer handling increasingly complex tasks. To apply image recognition techniques to DNA sequencing, Google transformed genetic data into visual representations. For instance, the four nucleotides (A, T, C, and G) are displayed in different colors. Researchers trained the model on millions of genome sequences, teaching it which patterns to prioritize and which to ignore.
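The idea of turning aligned reads into a picture can be sketched as follows. This is only an illustration of the general approach described above, not DeepVariant's actual encoding: the color assignments in `BASE_COLORS`, the helper names `encode_row` and `pileup_to_image`, and the layout (reference on the top row, one read per row below it) are assumptions made for the example.

```python
# Hypothetical color choice for each nucleotide (RGB); any distinct colors work.
BASE_COLORS = {
    "A": (0, 255, 0),
    "C": (0, 0, 255),
    "G": (255, 165, 0),
    "T": (255, 0, 0),
}
BLANK = (0, 0, 0)  # pixel used where a read does not cover the position


def encode_row(start, seq, window_start, window_width):
    """Turn one aligned sequence into a row of RGB pixels."""
    row = [BLANK] * window_width
    for offset, base in enumerate(seq):
        col = start + offset - window_start
        if 0 <= col < window_width:
            row[col] = BASE_COLORS.get(base, BLANK)
    return row


def pileup_to_image(reference, reads, window_start, window_width):
    """Stack the reference and each read into a small image-like grid."""
    image = [encode_row(0, reference, window_start, window_width)]
    for start, seq in reads:
        image.append(encode_row(start, seq, window_start, window_width))
    return image


if __name__ == "__main__":
    # A read that disagrees with the reference at position 2 (T instead of G).
    image = pileup_to_image("ACGT", [(1, "CTT")], window_start=0, window_width=4)
    for row in image:
        print(row)
```

In such a grid, a column where the read rows show a different color from the top row is a visual cue for a possible variant, which is the kind of pattern an image classifier can be trained to recognize.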

The result is a system that outperforms previous methods in error detection. Initially, the model encoded its input in three color layers (channels), but the latest version uses seven, allowing for a more precise representation. The software is open source, enabling researchers worldwide to use and improve it further.

While DeepVariant isn’t 100% accurate, its success highlights the growing impact of machine learning on genetics. Genomic data is massive and complex, and machines may just be the key to unlocking its full potential.

(Source: Gizmodo; Compilation: NetEase Smart News Robot; Reviewer: Qin Hao)
