Since our inception, we’ve been pitching a vision of Biological Search and of a Biological Atlas. We’ve built our platform to be able to ingest public and private biological data, to operate flexibly on that data, and to search over the entire corpus. And starting this week, we’ve made the Enable Cloud Platform free to academic accounts, allowing for even more users to upload, analyze, and share their biological data.
Meanwhile, recent advancements in Large Language Models have accelerated change in many industries; in Biology, we see an unprecedented opportunity to make tooling, data, and knowledge drastically more accessible.
Today we’re releasing a first step in Generative Biological Search on our growing Atlas, powered by LLMs. In just the last few weeks, we’ve been able to show that we can use LLMs to:
This is undoubtedly a first step of many. Beyond the work shared here, this first step serves as an example of how AI, LLMs, and General Intelligence can solve real problems for scientists.
We classify the work we’ve done into three main categories:
The Unsupervised Clustering extension allows users to run industry standard algorithms for grouping cells into types. The user must then manually label each cell group with its cell type annotation.
We now leverage LLMs to suggest cluster labels for each group. We draw on the broad knowledge that LLMs acquire in training to identify common cell types and to explain why each label was suggested, based on the cluster’s biomarker expression statistics. We are excited to accelerate a crucial task that once required significant time and biological expertise from researchers.
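As a rough illustration of how a cluster’s biomarker expression statistics might be turned into a request for a label, here is a minimal Python sketch. It is not Enable’s implementation: the function name `build_label_prompt`, the prompt wording, and the example marker values are all hypothetical, and the step that actually sends the prompt to an LLM is left out.

```python
from typing import Dict


def build_label_prompt(cluster_id: str, mean_expression: Dict[str, float], top_n: int = 3) -> str:
    """Build a text prompt asking an LLM to suggest a cell type for one cluster.

    Hypothetical helper: the prompt format is an assumption, not Enable's.
    """
    # Rank biomarkers by mean expression, highest first, and keep the top few.
    ranked = sorted(mean_expression.items(), key=lambda kv: kv[1], reverse=True)
    marker_lines = "\n".join(f"- {name}: {value:.2f}" for name, value in ranked[:top_n])
    return (
        f"Cluster {cluster_id} has these top biomarkers by mean expression:\n"
        f"{marker_lines}\n"
        "Suggest the most likely cell type and explain your reasoning."
    )


# Made-up statistics for a single cluster, for illustration only.
example_prompt = build_label_prompt(
    "7", {"CD3": 2.41, "CD8": 1.87, "CD20": 0.05, "PanCK": 0.02}, top_n=2
)
print(example_prompt)
```

Asking for the reasoning alongside the label is what lets a researcher check the suggestion against the marker statistics rather than accept it blindly.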
One of our overarching goals at Enable is to build the world’s largest Biological Atlas. With it, users can access their own data, and augment their work by discovering other relevant datasets.
Using LLMs, we are now able to suggest similar studies for any given study. We hope this serves as a way to streamline data discoverability, and to scale the impact and power of scientists’ work.
The Explorer allows users to visualize comparisons across clinical cohorts. We allow for flexible cohort definitions, enabling powerful analysis but making cohort definition a potentially tedious task.
Users can now define cohorts using natural language. This feature converts the user’s description into a structured cohort definition that the Explorer uses to render its plots, which can significantly reduce the time it takes to analyze your data.
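To make “structured cohort output” concrete, the sketch below shows one way a model’s JSON reply could be parsed and validated before it reaches a plotting layer. The schema (`name`, `filters`, `field`, `operator`, `value`), the class names, and the sample response are assumptions made for illustration; the Explorer’s actual cohort format is not described here.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed set of comparison operators a cohort filter may use.
ALLOWED_OPERATORS = {"==", "!=", ">", ">=", "<", "<="}


@dataclass
class CohortFilter:
    field: str
    operator: str
    value: object


@dataclass
class Cohort:
    name: str
    filters: List[CohortFilter]


def parse_cohort(llm_json: str) -> Cohort:
    """Parse a (hypothetical) JSON cohort description and reject unknown operators."""
    raw = json.loads(llm_json)
    filters = []
    for item in raw["filters"]:
        if item["operator"] not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {item['operator']}")
        filters.append(CohortFilter(item["field"], item["operator"], item["value"]))
    return Cohort(name=raw["name"], filters=filters)


# What a model might return for "responders older than 60" (made-up example).
hypothetical_response = (
    '{"name": "Responders over 60", "filters": ['
    '{"field": "response", "operator": "==", "value": "responder"}, '
    '{"field": "age", "operator": ">", "value": 60}]}'
)
cohort = parse_cohort(hypothetical_response)
print(cohort)
```

Validating the reply against a fixed schema means a malformed or unexpected model output fails loudly instead of silently producing a wrong plot.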
AI is also playing a key role in the future of biological search on our platform. LLM technology has unlocked full free-text semantic search within our Atlas Search, on all levels of our data hierarchy. This is in addition to the more granular search that we already support.
You can now search for something like “human skin samples” and receive all relevant results, whether they’re classified as “skin”, “epithelial”, or other related terms. Because the labels and metadata provided with datasets are rarely standardized, this opens the door for exciting new cross-dataset research and insight generation.
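One common way to build this kind of semantic search is to compare embedding vectors by cosine similarity, so that related terms rank close together even when the wording differs. This post does not say which technique Atlas Search uses, so treat the following as a generic sketch: the three-dimensional vectors are hand-written toy values rather than real model embeddings, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: List[float], label_vecs: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in label_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings of dataset labels (assumed values).
toy_vectors = {
    "skin": [0.9, 0.1, 0.0],
    "epithelial": [0.8, 0.3, 0.1],
    "liver": [0.1, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]  # stand-in for the embedding of "human skin samples"

ranking = rank_by_similarity(toy_query, toy_vectors)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

With these toy values, “skin” and “epithelial” both score far above “liver”, which is the behavior the search description relies on.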
Finally, we’re excited to release the initial version of our Enable research assistant powered by LLMs.
This assistant will be deeply integrated throughout the entire Enable platform, able to assist researchers through their entire research process. Our assistant has:
To guard against acting on inaccurate or invalid responses, the assistant always waits for user confirmation before taking any action. Additionally, to keep the scientific process flexible, we built out support for richer interaction patterns: redirecting users throughout the platform, citing sources and linking information, and returning structured answers with multiple suggested next steps for the user.
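The confirm-before-acting pattern can be sketched in a few lines of Python. This is an illustrative outline under assumed names (`ProposedAction`, `execute_if_confirmed`), not the assistant’s actual code: the assistant proposes an action with a human-readable description, and the side effect runs only if the user approves it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProposedAction:
    description: str        # shown to the user before anything happens
    run: Callable[[], str]  # side-effecting step, called only after approval


def execute_if_confirmed(action: ProposedAction, confirm: Callable[[str], bool]) -> str:
    """Run the action only when the confirm callback approves its description."""
    if not confirm(action.description):
        return "skipped"
    return action.run()


created_cohorts: List[str] = []


def create_cohort() -> str:
    # Stand-in for a real platform action (hypothetical).
    created_cohorts.append("Responders over 60")
    return "created"


proposal = ProposedAction("Create cohort 'Responders over 60'", create_cohort)

# The confirm callbacks below simulate a user declining, then approving.
declined = execute_if_confirmed(proposal, confirm=lambda description: False)
approved = execute_if_confirmed(proposal, confirm=lambda description: True)
print(declined, approved, created_cohorts)
```

Keeping the description separate from the action lets the user see exactly what will happen before anything is changed.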
The assistant can now be accessed from any page in the Enable Portal. In the near future, we’ll continue to improve it, for example by giving it context about what the user is viewing so that it can target its responses. If a user is viewing a specific region, the assistant may provide details about the region and its associated metadata without prompting.
Overall, we’re extremely excited about these new AI-backed features and have already seen how these features can provide true value to scientists. We’re just beginning to see the full impact of this technology, and we expect that tools and models will only continue to grow more powerful and reliable.
However, LLMs today are not yet trusted to make decisions without human supervision, and can still produce hallucinations or fail to return relevant responses. The huge improvements from GPT-3.5 to GPT-4 have allowed us to feel confident integrating the technology into our platform, but these features still require human verification. As the field continues to improve, we believe that more of these concerns will be resolved, and we’re confident that the path forward involves embracing AI in the research process.
GPT-4 responses currently carry noticeable latency. This affects the Enable Assistant most; in this first release, API responses may take 10-30 seconds. We expect this to improve over time.
Generative Biological Search is available on the Enable Cloud Platform starting today! You can access it in all of the above locations by signing up for an Enable account.
These features are all still experimental, and we’re working hard to improve and expand their capabilities. Beyond today’s launch, we’re planning even more improvements to our platform, from adding more general biological knowledge in search, to better ways to operate on our data and interact with images more dynamically, to easier ways to summarize analyses.
Imagine having assistance throughout your entire workflow, guiding you through the research process and providing insight where you need it. Ask it to summarize your data; to recommend next steps; to build plots and create insights from your findings; and to relate your findings to other studies. More than ever before, we believe we have the tools to chart a concrete roadmap toward this vision.
By continuing to innovate and adapt the latest in AI to science, we can help accelerate the entire field of biological research. On the Enable Cloud Platform, we hope users will be able to conduct novel biological research with increasing ease, increasing collaboration, and increasing support.
Our long-term vision hasn’t changed: a Biological Atlas to enable Biological Search and Open Research. We hope that developing our platform and Atlas will be a community-driven effort. Whether through AI and LLM expertise, or biological data and research, we are always looking for feedback and ideas, and even hope to build our own open source ecosystems. If you are a biologist, scientist, engineer, or anything in between, check out our website, get started with our platform, and reach out to us to collaborate on our vision for the future of science.
Interested in collaborating with us to leverage the latest in high-parameter biological tools and uncover new biology, have a question about our platform, or something else? Please reach out!