Ayush Garg

Author of Pygetpapers, Researcher

I am a passionate developer with a keen interest in scientific research, especially in the open science sphere. My primary research interests include computer vision, bioinformatics, and text mining.

Tech Stack

Python
Git
JavaScript
React
Java


Richmond, VA


Scientific Research, Artificial Intelligence, Open Science




Hackathons, Cooking, Music


Ayush Garg is a highly motivated and skilled developer with a strong background in research and a passion for open science. He is currently pursuing his research on grounded NLP at the University of Richmond, under the guidance of Dr. Catherine Finegan-Dollak.
His research aims to translate human instructions into computer-executable commands in simulated environments such as video games, and he has acquired a deep understanding of machine learning and the connections between its different domains.
He has extensive experience in developing scientific libraries and has made over 600 contributions to open source projects in the past year. He is dedicated to promoting open source practices and advocating for free speech, democratic values, and the cooperative progress of humanity. As an extrovert, he is eager to help others and grow professionally.


Ayush is known for his attention to detail and ability to take on responsibilities and initiatives.
He is able to view, communicate, and analyze situations from multiple perspectives, which helps him find effective solutions to problems.
He is also proactive in finding and taking advantage of opportunities to further develop his skills and contribute to the community.


Ayush has received prestigious awards at hackathons hosted by the University of Virginia, the University of British Columbia, and Nanyang Technological University.
He also collaborated with his team to present a project to the President of Singapore at the Digital for Life festival.
His research has been published in Research Ideas and Outcomes and the Journal of Open Source Software.


Ayush's ultimate goal is to make a positive impact on the scientific world and give back to the community. He is dedicated to addressing pressing issues and driving technological advancements through innovation.
His experience as a developer of docanalysis, where he implemented entity recognition using natural language processing, and his research with Dr. Peter Murray-Rust of the University of Cambridge, where he developed pygetpapers, a modular Python library with command-line integration for retrieving scientific literature from open repositories, are a testament to his goal-oriented approach.
He also aspires to promote open access to research literature and encourage greater inclusivity within the scientific community.


Bioinformatics
Computer Vision
Text Mining

Ayush's research with Dr. Catherine Finegan-Dollak at the University of Richmond focuses on utilizing grounded NLP to translate human instructions into computer-executable commands. This research aims to bridge the gap between human communication and machine understanding by allowing for more efficient and effective interactions between the two.
The research is focused on developing natural language understanding systems that can accurately interpret human instructions and execute them in a simulated environment, such as a video game. This involves using techniques such as language grounding, semantic parsing, and reinforcement learning. The goal is to create systems that can understand and act on human instructions in a way that is more natural and intuitive for humans.
In summary, the research aims to build natural language understanding systems that accurately interpret human instructions and execute them in a simulated environment, making interactions between humans and machines more efficient and effective.
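As a toy illustration of the instruction-to-command idea described above (this is not the actual research system; the patterns and command names below are invented for this sketch, and real systems use learned semantic parsers rather than regexes):

```python
# Toy sketch of translating a natural language instruction into a
# computer-executable command. Patterns and command vocabulary are
# hypothetical; real grounded NLP systems learn these mappings.

import re

# Map simple instruction patterns to command names.
PATTERNS = [
    (re.compile(r"(?:go|walk|move) (?:to )?the (\w+)"), "MOVE_TO"),
    (re.compile(r"(?:pick up|grab|take) the (\w+)"), "PICK_UP"),
    (re.compile(r"open the (\w+)"), "OPEN"),
]

def parse_instruction(text):
    """Translate one instruction into a (command, target) pair, or None."""
    text = text.lower().strip()
    for pattern, command in PATTERNS:
        match = pattern.search(text)
        if match:
            return (command, match.group(1))
    return None

print(parse_instruction("Walk to the door"))  # ('MOVE_TO', 'door')
print(parse_instruction("Grab the key"))      # ('PICK_UP', 'key')
```

The interesting part of the research is precisely what this sketch elides: grounding the parsed command in the simulated environment's state, which is where techniques like reinforcement learning come in.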

pygetpapers has been developed to allow searching of the scientific literature in repositories with a range of textual queries and metadata. It downloads content using APIs in an automated fashion and is designed to be extensible to the growing number of Open Access repositories.

This JOSS article elaborates further on the design of the tool.

An increasing amount of research, particularly in medicine and applied science, is now based on meta-analysis and systematic review of the existing literature (example). In such reviews scientists frequently download thousands of articles and analyse them by Natural Language Processing (NLP) through Text and Data Mining (TDM) or Content Mining (ref). A common approach is to search bibliographic resources with keywords, download the hits, scan them manually, and reject papers that do not fit the criteria for the meta-analysis. The typical text-based searches on sites are broad, with many false positives, and often based only on abstracts. We know of cases where systematic reviewers downloaded 30,000 articles and eventually used 30. Retrieval is often done by crawling / scraping sites such as journals, but is easier and faster when the articles are in Open Access repositories such as arXiv, Europe PMC, bioRxiv, and medRxiv. But each repository has its own API and functionality, which makes it hard for individuals to (a) access them, (b) set flags, and (c) use generic queries.

In 2015 we reviewed tools for scraping websites and decided that none met our needs, so we developed getpapers, whose key advance was integrating query submission with bulk fulltext download of all the hits. getpapers was written in Node.js and has now been completely rewritten in Python 3 (pygetpapers) for easier distribution and integration. Typical use of getpapers is shown in a recent paper where the authors "analyzed key term frequency within 20,000 representative [Antimicrobial Resistance] articles".
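At the REST level, the kind of repository query that pygetpapers automates looks roughly like this. The Europe PMC search endpoint below is its real public API; the query string, helper name, and defaults are illustrative, and pygetpapers adds pagination, fulltext download, and metadata handling on top:

```python
# Sketch of the kind of repository query pygetpapers automates, built
# against Europe PMC's public REST search API. The query is illustrative.

from urllib.parse import urlencode

EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def build_search_url(query, page_size=25, result_format="json"):
    """Build a Europe PMC search URL for a textual query."""
    params = {
        "query": query,
        "pageSize": page_size,
        "format": result_format,
    }
    return EUROPE_PMC_SEARCH + "?" + urlencode(params)

url = build_search_url('"antimicrobial resistance" AND OPEN_ACCESS:y')
print(url)
```

Every repository exposes a different flavour of this pattern (different endpoints, parameters, and result formats), which is exactly the variation a tool like pygetpapers hides behind one command-line interface.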

Unsupervised entity extraction from sections of papers that have defined boilerplates. Examples of such sections include Ethics Statements, Funders, Acknowledgments, and so on.

Primary Purpose:

Extracting Ethics Committees and other entities related to Ethics Statements from papers
Curating the extracted entities to public databases like Wikidata
Building a feedback loop where we go from unsupervised entity extraction, to curating the extracted information in public repositories, and then to supervised entity extraction.

Subsidiary Purpose(s):

The use case can go beyond Ethics Statements. docanalysis is a general package that can extract relevant entities from the sections of your interest.
Sections like Acknowledgements, Data Availability Statements, etc. all have a fairly generic sentence structure. All you have to do is create an ami dictionary that contains boilerplates for the section of your interest. You can then use docanalysis to extract entities. Check this section, which outlines steps for creating custom dictionaries. In the case of acknowledgements or funding, you might be interested in the players involved. Or you might have a use case we have never thought of!
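As a rough sketch of what boilerplate-driven extraction means in practice (this is not docanalysis's actual implementation; the patterns and helper below are invented for illustration), the generic sentence structure of a section can anchor a pattern, and whatever follows the boilerplate is captured as a candidate entity:

```python
# Hedged sketch of boilerplate-driven entity extraction: boilerplate
# phrases from Ethics Statements anchor a regex, and the text that
# follows is captured as a candidate entity. Patterns are illustrative.

import re

BOILERPLATES = [
    r"approved by (?:the )?([A-Z][\w\s]+? (?:Ethics Committee|Review Board))",
    r"ethical approval was granted by (?:the )?([A-Z][\w\s]+)",
]

def extract_entities(sentence):
    """Return candidate ethics-committee entities found in one sentence."""
    entities = []
    for pattern in BOILERPLATES:
        entities += re.findall(pattern, sentence)
    return entities

statement = "The study was approved by the Institutional Ethics Committee of ABC."
print(extract_entities(statement))  # ['Institutional Ethics Committee']
```

The feedback loop described above would then curate such candidates into a public database like Wikidata, whose entries can in turn supervise a trained extractor.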


Podcast Productions


Video Productions


Scientific Talks



Here are the projects I have worked on.
Also take a look at my GitHub profile for more details.


Science.org.in is a social media platform for science enthusiasts. Message friends, add to your timeline and connect to groups.

HTML5 CSS3 JavaScript

A website to cater to all your technological needs.

Python JavaScript HTML CSS

An AI-powered chatbot serving people at Lions Befrienders.

Python Flutter

FinLearn gamifies teaching finance to youth. It uses its own in-app blockchain-based currency called FinCoin, which is awarded to players when they complete daily tasks such as making their bed, studying, and doing physical activity.
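To illustrate what an in-app, blockchain-style reward ledger could look like (a hypothetical sketch, not FinLearn's actual implementation; the helper and field names are invented):

```python
# Toy hash-chained ledger: each reward entry stores the hash of the
# previous entry, so history cannot be silently rewritten. Illustrative
# only; not FinLearn's actual FinCoin implementation.

import hashlib
import json

def add_block(chain, task, coins):
    """Append a reward entry linked to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"task": task, "coins": coins, "prev_hash": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    chain.append(body)
    return chain

ledger = []
add_block(ledger, "made bed", 5)
add_block(ledger, "studied 1 hour", 10)
print(ledger[1]["prev_hash"] == ledger[0]["hash"])  # True
```

Because each entry's hash covers the previous entry's hash, tampering with any earlier reward invalidates every entry after it, which is the property that makes a chained ledger useful for an in-app currency.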

Python HTML CSS JavaScript

CoVax serves as a user-friendly medium of interaction between the three main strata involved in the vaccination process: the suppliers, the medical professionals, and the general public.
Our first app is an inventory-based application that uses unique QR codes to tell hospitals about the availability of vaccines. Each vaccine and each shelf is associated with a unique ID, which we use to identify stock.
Our vision is that security camera feeds can be fed into our programme to update the stock of vaccines in real time as they are taken in and out of cold storage.
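A minimal sketch of the QR-based stock-tracking idea (hypothetical names and API, not the CoVax codebase): each scan event moves a uniquely identified vaccine unit on or off a uniquely identified shelf.

```python
# Hypothetical sketch of per-ID stock tracking: each vaccine unit and
# shelf has a unique ID, and scan events update the inventory.

inventory = {}  # shelf_id -> set of vaccine unit IDs currently stored

def scan_in(shelf_id, unit_id):
    """Record a vaccine unit placed on a shelf."""
    inventory.setdefault(shelf_id, set()).add(unit_id)

def scan_out(shelf_id, unit_id):
    """Record a vaccine unit taken off a shelf."""
    inventory.get(shelf_id, set()).discard(unit_id)

scan_in("SHELF-A1", "VAX-0001")
scan_in("SHELF-A1", "VAX-0002")
scan_out("SHELF-A1", "VAX-0001")
print(inventory)  # {'SHELF-A1': {'VAX-0002'}}
```

The camera-feed vision above amounts to generating these same scan_in / scan_out events automatically instead of from manual QR scans.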

React HTML CSS Firebase JavaScript

The Protect-21 application sends a friendly reminder notification to wear a mask every time the user leaves a key location (such as their home or place of business).
Our application also encourages the proper wearing of masks and verifies it to minimize the risk of exposure.
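The "remind on leaving a key location" behaviour boils down to a geofence check. Here is one hedged way it could be sketched (not Protect-21's actual code; the function names, radius, and coordinates are illustrative):

```python
# Sketch of a geofence check: compute the great-circle distance from the
# user's position to each saved key location, and trigger a reminder
# once the user is outside every location's radius. Illustrative only.

from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(a))

def should_remind(position, key_locations, radius_m=100):
    """True when the user is outside every key location's radius."""
    return all(haversine_m(*position, *loc) > radius_m
               for loc in key_locations)

home = (37.4219, -122.0840)
print(should_remind((37.4300, -122.0840), [home]))  # ~900 m away -> True
print(should_remind(home, [home]))                  # still at home -> False
```

In a real mobile app this check would be driven by the platform's geofencing API rather than polled manually, but the distance logic is the same.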

HTML5 CSS3 JavaScript React

Teach 4A Cause is an online learning-and-crowdfunding platform where specialized teachers from relevant backgrounds or professions can create a class, with the earnings used to help other people or a cause.
As Campaign Hosts, they can teach a class in any field they are passionate about.
Think of it as donating your time for impact, so they can use their resources to make the most effective impact for a good cause :)

HTML5 CSS3 JavaScript Python Firebase

The customers get a QR code which is printed onto a band when they first sign up for the program.
Each time they enter a bar, they scan this QR code and the bar is updated with their information, such as whether they are above the legal drinking age, as well as their gender, weight, and payment method (all confirmed at the time of registration and updated periodically and dynamically).
When customers want to order a drink, they scan the QR code on their wristband or mobile phone, which adds the drink to their tab and updates the approximated biometrics (we predict their BAC using mathematical models).
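One common mathematical model for the BAC prediction mentioned above is the Widmark formula. A hedged sketch (the app's actual model may differ; the constants below are the standard textbook values):

```python
# Sketch of a BAC estimate via the Widmark formula. r is the Widmark
# distribution ratio (~0.68 for men, ~0.55 for women), and BAC declines
# at roughly 0.015 % per hour of metabolism. Illustrative only.

def estimate_bac(alcohol_g, weight_kg, hours, r=0.68):
    """Estimate blood alcohol concentration (%) some hours after drinking."""
    bac = alcohol_g / (weight_kg * 1000 * r) * 100  # % when absorbed
    return max(0.0, bac - 0.015 * hours)            # minus metabolised share

# ~two standard drinks (28 g ethanol), 80 kg person, one hour later:
print(round(estimate_bac(28, 80, 1.0), 3))  # 0.036
```

Feeding each scanned drink's alcohol content into a model like this, keyed to the registered weight and gender, is what lets the tab double as a rough safety estimate.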

HTML5 CSS3 Arduino

When's the last time you lost your keys? Had that rotten bell pepper in the back of the fridge that you never got to? What if we could make space missions cleaner and more efficient?
In space, keeping track of items such as tools, scientific equipment, medical supplies, personal belongings, food and more is mission-critical.
It's costly, damaging to the environment, and highly complex to send up new items missed or lost during long-term space missions.
What do these problems have in common? Sachen solves all of them. We're the automatic inventory management system for everybody.


Hackathon Mentoring


Get in Touch
