iNaturalist, RCSB PDB, and NCBI
===============================
In this section, we will explore three popular APIs in the field of bioinformatics: `iNaturalist `_,
`RCSB Protein Data Bank `_ (Research Collaboratory for Structural Bioinformatics), and
`NCBI `_ (National Center for Biotechnology Information). These APIs provide
access to a wealth of biological data, including species observations, protein structures, and genomic
information. After going through this module, students should be able to:
- Understand the purpose and functionality of each API.
- Make API requests to retrieve data from each platform.
- Parse and utilize the retrieved data for various applications in bioinformatics.
iNaturalist
-----------
iNaturalist is a citizen science project and online social network of naturalists, citizen scientists,
and biologists built on the concept of mapping and sharing observations of biodiversity across the globe.
The hundreds of thousands of members share close to one million observations of plants, animals, fungi,
and other organisms every month. The iNaturalist API allows users to access data about species observations,
including information about the location, date, and species observed.
.. figure:: images/iNaturalist.png
:width: 600px
:align: center
iNaturalist main site.
Let's take a look at the `iNaturalist API documentation `_ to understand how to make requests and retrieve data.
.. figure:: images/iNaturalist_api.png
:width: 600px
:align: center
iNaturalist API documentation.
As the iNaturalist API documentation shows, we can make requests to retrieve observations. Since it is a
standard RESTful API, we could use the ``requests`` library in Python to interact with it. But there is an
easier way to interact with the iNaturalist API using the ``pyinaturalist`` library, which provides a more
user-friendly interface for accessing the API. So let's install the ``pyinaturalist`` library.
.. code-block:: console
[mbs337-vm]$ cd $HOME/mbs-337
[mbs337-vm]$ source .venv/bin/activate
(.venv) [mbs337-vm]$ pip3 install pyinaturalist
(.venv) [mbs337-vm]$ pip3 list
Package Version
-------------------- -----------
annotated-types 0.7.0
attrs 25.4.0
biopython 1.86
cattrs 26.1.0
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
cryptography 46.0.5
idna 3.11
iniconfig 2.3.0
jaraco.classes 3.4.0
jaraco.context 6.1.0
jaraco.functools 4.4.0
jeepney 0.9.0
keyring 25.7.0
markdown-it-py 4.0.0
mdurl 0.1.2
more-itertools 10.8.0
numpy 2.4.1
packaging 26.0
pip 24.0
platformdirs 4.9.2
pluggy 1.6.0
pycparser 3.0
pydantic 2.12.5
pydantic_core 2.41.5
Pygments 2.19.2
pyinaturalist 0.21.1
pyrate-limiter 2.10.0
pytest 9.0.2
python-dateutil 2.9.0.post0
redis 7.2.0
requests 2.32.5
requests-cache 1.3.0
requests-ratelimiter 0.8.0
rich 14.3.3
SecretStorage 3.5.0
six 1.17.0
typing_extensions 4.15.0
typing-inspection 0.4.2
url-normalize 2.2.1
urllib3 2.6.3
Now that we have the ``pyinaturalist`` library installed, we can start making requests to the iNaturalist API.
Before we dive into the code, let's take a moment to look at the `API documentation `_
for the ``pyinaturalist`` library to understand how to use it effectively.
.. figure:: images/iNaturalist_api_docs_get_observations.png
:width: 600px
:align: center
iNaturalist API reference for ``get_observations``.
Let's retrieve the observations recorded within a 1 km radius of the University of Texas at Austin
(latitude 30.2895, longitude -97.7368) during the week of February 18-24, 2026. We can use the following
code to do this:
.. code-block:: console
[mbs337-vm]$ python3
Python 3.12.3 (main, Jan 22 2026, 20:57:42) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyinaturalist as pin
>>> from rich import print
>>>
>>> obs = pin.get_observations(lat="30.2895", lng="-97.7368", radius=1, d1="2026-02-18", d2="2026-02-24")
>>> pin.pprint(obs)
ID          Taxon ID   Taxon                                  Observed on    User              Location
-------------------------------------------------------------------------------------------------------------------------------------
339941488   18205      Melanerpes carolinus (Red-Bellied      Feb 23, 2026   johnathan12034    W 30th St, Austin, TX, US
                       Woodpecker)
339919433   1427972    Irpex latemarginatus (Frothy           Feb 23, 2026   kirsten24         701 Dean Keeton/San Jacinto, Austin,
                       Porecrust)                                                              TX 78705, USA
339917918   81708      Aesculus pavia (Red Buckeye)           Feb 23, 2026   kirsten24         Travis County, US-TX, US
339841781   118492     Helicoverpa zea (Corn Earworm Moth)    Feb 22, 2026   kuramazilla       Speedway, Austin, TX, US
339835262   43111      Sylvilagus floridanus (Eastern         Feb 21, 2026   lauren1414        W 24th St, Austin, TX, US
                       Cottontail)
339813315   164229     Jasminum mesnyi (Primrose Jasmine)     Feb 19, 2026   rebraph           San Jacinto Blvd, Austin, TX, US
339806172   47126      Kingdom Plantae (Plants)               Feb 22, 2026   bradc559          San Antonio St, Austin, TX, US
339726755   54900      Papilio polyxenes asterius (Eastern    Feb 21, 2026   utfarmstand       The University of Texas at Austin,
                       Black Swallowtail)                                                      Austin, TX, US
339668489   164038     Ilex cornuta (Chinese Holly)           Feb 21, 2026   liljegrenv        Rio Grande St, Austin, TX, US
339657223   4956       Ardea herodias (Great Blue Heron)      Feb 21, 2026   vivian38785       San Jacinto Blvd, Austin, TX, US
339645270   8229       Cyanocitta cristata (Blue Jay)         Feb 21, 2026   chasek29          701 Dean Keeton/San Jacinto, Austin,
                                                                                               TX 78705, USA
339645214   13858      Passer domesticus (House Sparrow)      Feb 21, 2026   vivian38785       Rio Grande St, Austin, TX, US
339642368   48502      Cercis canadensis (Eastern Redbud)     Feb 21, 2026   chasek29          701 Dean Keeton/San Jacinto, Austin,
                                                                                               TX 78705, USA
339642071   47351      Genus Prunus (Plums, Cherries, And     Feb 21, 2026   chasek29          701 Dean Keeton/San Jacinto, Austin,
                       Allies)                                                                 TX 78705, USA
339573515   9607       Quiscalus mexicanus (Great-Tailed      Feb 21, 2026   avi_subramanian   Austin
                       Grackle)
339501990   41663      Procyon lotor (Common Raccoon)         Feb 20, 2026   kuramazilla       E 24th St, Austin, TX, US
339495572   14886      Mimus polyglottos (Northern            Feb 18, 2026   mariaks16         W 24th St, Austin, TX, US
                       Mockingbird)
339488649   57056      Medicago lupulina (Black Medick)       Feb 20, 2026   adrianj           Red River St, Austin, TX, US
339365066   103498     Ischnura posita (Fragile Forktail)     Feb 19, 2026   etaan             Cedar St, Austin, TX, US
339331168   47124      Class Magnoliopsida (Dicots)           Feb 19, 2026   lexi_moffett      The University of Texas at Austin,
                                                                                               Austin, TX, US
339202074                                                     Feb 18, 2026   chrismyzoo        Austin
339197531   1555999    Nephroia carolina (Carolina            Feb 18, 2026   utfarmstand       E 21st St, Austin, TX, US
                       Snailseed)
>>>
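For comparison, the ``get_observations`` call above corresponds roughly to a plain GET request against the
iNaturalist REST API. The sketch below only builds the request URL with the standard library and does not
send it. It assumes the public v1 endpoint ``https://api.inaturalist.org/v1/observations`` and that the
keyword arguments map one-to-one onto query parameters of the same names; ``build_observations_url`` is a
hypothetical helper, not part of ``pyinaturalist``.

```python
from urllib.parse import urlencode

# Assumed base URL of the public iNaturalist v1 observations endpoint.
BASE_URL = "https://api.inaturalist.org/v1/observations"


def build_observations_url(lat, lng, radius_km, d1, d2):
    """Return a GET URL for observations within a radius and date range."""
    params = {
        "lat": lat,           # latitude of the search center
        "lng": lng,           # longitude of the search center
        "radius": radius_km,  # search radius in kilometers
        "d1": d1,             # first observation date (YYYY-MM-DD)
        "d2": d2,             # last observation date (YYYY-MM-DD)
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_observations_url(30.2895, -97.7368, 1, "2026-02-18", "2026-02-24")
print(url)
```

Sending this URL with ``requests.get`` and calling ``.json()`` on the response should yield the same kind of
dictionary that ``pyinaturalist`` returns, but the library handles details such as rate limiting and
caching for us.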
Another nice feature of the ``pyinaturalist`` library is its data models, which let us work with structured
objects instead of raw dictionaries. For example, we can convert the response into ``Observation`` objects
to access observation attributes more easily, and then take a closer look at one of them.
.. code-block:: console
>>> my_obs = pin.Observation.from_json_list(obs)
>>> type(my_obs[14])
<class 'pyinaturalist.models.observation.Observation'>
>>> print(my_obs[14])
Observation(
id=339573515,
created_at='2026-02-21 09:11:43-06:00',
captive=False,
community_taxon_id=9607,
identifications_count=3,
identifications_most_agree=True,
identifications_most_disagree=False,
identifications_some_agree=True,
location=(30.2868747711, -97.7400512695),
mappable=True,
num_identification_agreements=3,
num_identification_disagreements=0,
oauth_application_id=333,
obscured=False,
observed_on='2026-02-21 09:11:37-06:00',
owners_identification_from_vision=True,
place_guess='Austin',
place_ids=[
1,
18,
431,
9853,
53217,
53218,
53222,
59613,
60211,
62332,
63856,
64422,
64423,
65181,
66741,
67465,
68119,
80998,
82256,
97394,
113590,
124748,
146145,
148549,
151222,
151232,
160119
],
positional_accuracy=15,
preferences={'prefers_community_taxon': None},
public_positional_accuracy=15,
quality_grade='research',
reviewed_by=[115129, 3953595, 4483440, 8880881],
site_id=1,
species_guess='Great-tailed Grackle',
taxon_geoprivacy='open',
updated_at='2026-02-21 14:06:02-06:00',
uri='https://www.inaturalist.org/observations/339573515',
uuid='26673574-3cd1-470c-a6b9-ad36b4d8a580',
annotations=[],
application=None,
comments=[],
faves=[],
flags=[],
identifications=[
Identification(
id=765249631,
username='isaaceastland',
taxon_name='Quiscalus mexicanus (Great-Tailed Grackle)',
created_at='Feb 21, 2026',
truncated_body=''
),
Identification(
id=765180592,
username='avi_subramanian',
taxon_name='Quiscalus mexicanus (Great-Tailed Grackle)',
created_at='Feb 21, 2026',
truncated_body=''
),
Identification(
id=765182296,
username='bobthebob101',
taxon_name='Quiscalus mexicanus (Great-Tailed Grackle)',
created_at='Feb 21, 2026',
truncated_body=''
),
Identification(
id=765289717,
username='aguilita',
taxon_name='Quiscalus mexicanus (Great-Tailed Grackle)',
created_at='Feb 21, 2026',
truncated_body=''
)
],
ofvs=[],
photos=[Photo(id=617763422, url='https://static.inaturalist.org/photos/617763422/square.jpg')],
project_observations=[],
quality_metrics=[],
sounds=[],
taxon=Taxon(id=9607, full_name='Quiscalus mexicanus (Great-Tailed Grackle)'),
user=User(id=4483440, login='avi_subramanian', name='Avi Subramanian'),
votes=[]
)
Since we are using the data model, we can easily access the attributes of the observation.
For example, we can access the taxon name.
.. code-block:: console
>>> print(my_obs[14].taxon.full_name)
Quiscalus mexicanus (Great-Tailed Grackle)
And we can also access the photos associated with the observation.
.. code-block:: console
>>> print(my_obs[14].photos)
[
Photo(
id=617763422,
attribution='(c) Avi Subramanian, all rights reserved',
original_dimensions=(1152, 2048),
url='https://static.inaturalist.org/photos/617763422/square.jpg'
)
]
.. figure:: images/observation_photo_grackle_medium.jpg
:align: center
Photo of the Great-Tailed Grackle observation.
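The photo URL in the output above points at the small ``square`` thumbnail. As a minimal sketch, and
assuming that iNaturalist encodes the image size as the file name (``square``, ``small``, ``medium``,
``large``, ``original``), a URL for a different size can be derived by swapping that component. The helper
name ``photo_url_for_size`` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit

# Size names assumed to be available for photos hosted by iNaturalist.
SIZES = ("square", "small", "medium", "large", "original")


def photo_url_for_size(url, size):
    """Swap the size component (the file name stem) of a photo URL."""
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    parts = urlsplit(url)
    path = PurePosixPath(parts.path)
    # Keep the original extension, replace only the stem (e.g. square -> medium).
    new_path = path.with_name(f"{size}{path.suffix}")
    return urlunsplit(parts._replace(path=str(new_path)))


square_url = "https://static.inaturalist.org/photos/617763422/square.jpg"
print(photo_url_for_size(square_url, "medium"))
```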
RCSB Protein Data Bank
----------------------
The RCSB Protein Data Bank (PDB) is a repository for the 3D structural data of large biological molecules,
such as proteins and nucleic acids. The PDB provides a wealth of information about the structure and function
of these molecules, which is crucial for understanding biological processes and developing new drugs.
The RCSB PDB API allows users to access this structural data programmatically, enabling researchers to retrieve
information about specific proteins, their structures, and related data.
.. figure:: images/rcsb-pdb.png
:width: 600px
:align: center
RCSB PDB main page.
The RCSB PDB has multiple APIs available, including a Search API for querying the database and a Data API for
retrieving detailed information about specific entries. Let's first take a look at the `RCSB PDB Search API documentation `__.
.. figure:: images/rcsb-pdb-search-api.png
:width: 600px
:align: center
RCSB PDB Search API documentation page.
The Search API lets us run complex queries against the PDB. It is designed to return only the identifiers
(and some additional metadata) of the matching hits, not the full records. The basic idea is to send a GET
request to ``https://search.rcsb.org/rcsbsearch/v2/query?json={search-request}``, where ``{search-request}``
is a URL-encoded JSON object that specifies the search criteria. For example:
.. code-block:: json
{
  "query": {
    "type": "terminal",
    "service": "full_text",
    "parameters": {
      "value": "thymidine kinase"
    }
  },
  "return_type": "entry"
}
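To make the ``{search-request}`` placeholder concrete, the sketch below serializes the JSON object above
and URL-encodes it into the query string using only the standard library. It builds the URL without
sending the request; ``build_search_url`` is a hypothetical helper name.

```python
import json
from urllib.parse import quote

SEARCH_ENDPOINT = "https://search.rcsb.org/rcsbsearch/v2/query"

# The same full-text search request shown above, as a Python dictionary.
search_request = {
    "query": {
        "type": "terminal",
        "service": "full_text",
        "parameters": {"value": "thymidine kinase"},
    },
    "return_type": "entry",
}


def build_search_url(request):
    """Serialize the request to compact JSON and URL-encode it."""
    payload = json.dumps(request, separators=(",", ":"))
    # safe="" percent-encodes every reserved character, including braces and quotes.
    return f"{SEARCH_ENDPOINT}?json={quote(payload, safe='')}"


url = build_search_url(search_request)
print(url)
```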
Again, we could use the lower-level Python ``requests`` library to interact with the Search API (and Data API),
but the ``rcsb-api`` library provides a more convenient interface to both. So let's install it.
.. code-block:: console
[mbs337-vm]$ cd $HOME/mbs-337
[mbs337-vm]$ source .venv/bin/activate
(.venv) [mbs337-vm]$ pip3 install rcsb-api
(.venv) [mbs337-vm]$ pip3 list
Package Version
-------------------- -----------
annotated-types 0.7.0
anyio 4.12.1
attrs 25.4.0
biopython 1.86
cattrs 26.1.0
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
cryptography 46.0.5
graphql-core 3.2.7
h11 0.16.0
httpcore 1.0.9
httpx 0.28.1
idna 3.11
iniconfig 2.3.0
jaraco.classes 3.4.0
jaraco.context 6.1.0
jaraco.functools 4.4.0
jeepney 0.9.0
keyring 25.7.0
markdown-it-py 4.0.0
mdurl 0.1.2
more-itertools 10.8.0
nest-asyncio 1.6.0
numpy 2.4.1
packaging 26.0
pip 24.0
platformdirs 4.9.2
pluggy 1.6.0
pycparser 3.0
pydantic 2.12.5
pydantic_core 2.41.5
Pygments 2.19.2
pyinaturalist 0.21.1
pyrate-limiter 2.10.0
pytest 9.0.2
python-dateutil 2.9.0.post0
rcsb-api 1.5.0
redis 7.2.0
requests 2.32.5
requests-cache 1.3.0
requests-ratelimiter 0.8.0
rich 14.3.3
rustworkx 0.17.1
SecretStorage 3.5.0
six 1.17.0
tqdm 4.67.3
typing_extensions 4.15.0
typing-inspection 0.4.2
url-normalize 2.2.1
urllib3 2.6.3
With the ``rcsb-api`` library installed, we can start making requests to the RCSB PDB APIs. Let's first take a
look at the `rcsb-api documentation `_ to understand how to use the
library effectively.
.. figure:: images/rcsb-api-docs.png
:width: 600px
:align: center
rcsb-api documentation page.
The first thing we're going to do is to use the Search API to find entries in the PDB that match a specific query.
For example, let's search for entries that contain the term "Hemoglobin". We can use the following code to do this:
.. code-block:: console
[mbs337-vm]$ python3
Python 3.12.3 (main, Jan 22 2026, 20:57:42) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rcsbapi.search import TextQuery
>>>
>>> query = TextQuery(value="Hemoglobin")
>>>
>>> results = query()
>>> results_list = list(results)
>>> len(results_list)
8918
>>> for rid in sorted(results_list):
...     print(rid)
...
101M
102M
103M
104M
105M
106M
107M
108M
109M
10NH
110M
111M
112M
155C
...
4HGJ
4HHB
4HHR
...
9YVV
9ZKF
9ZLJ
9ZLM
Now that we have a list of entry IDs that match our search query, we can use the Data API to retrieve detailed
information about a specific entry. For example, let's retrieve information about the entry with ID "4HHB", which
is the PDB ID for human hemoglobin. As we did with the Search API, let's first take a look at the
`RCSB PDB Data API documentation `__.
.. figure:: images/rcsb-pdb-data-api.png
:width: 600px
:align: center
RCSB PDB Data API documentation page.
As you can see from the Data API documentation, there are two ways to retrieve data for a specific entry: using
the RESTful API or using `GraphQL `_. The RESTful API exposes each kind of record at its own URL and returns
the complete record, while GraphQL is a query language that lets you specify exactly which fields you want to
retrieve. The ``rcsb-api`` library we already installed takes the GraphQL route: we list the fields we want,
and it builds and sends the query for us.
To retrieve information about the entry with ID "4HHB", we can use the following code:
.. code-block:: console
>>> from rcsbapi.data import DataQuery as Query
>>>
>>> query = Query(
... input_type="entries",
... input_ids=["4HHB"],
... return_data_list=["exptl.method", "struct.title"]
... )
>>>
>>> result = query.exec()
>>>
>>> type(result)
<class 'dict'>
>>> print(result)
{'data': {'entries': [{'rcsb_id': '4HHB', 'exptl': [{'method': 'X-RAY DIFFRACTION'}], 'struct': {'title': 'THE CRYSTAL STRUCTURE OF HUMAN DEOXYHAEMOGLOBIN AT 1.74 ANGSTROMS RESOLUTION'}}]}}
>>>
>>> print(query.get_query())
query{entries(entry_ids: ["4HHB"]){
rcsb_id
exptl{
method
}
struct{
title
}}}
>>>
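The result is an ordinary nested dictionary, so the requested fields can be pulled out with plain indexing.
The sketch below uses the dictionary printed above as sample input and flattens each entry into a small
summary. Note that ``exptl`` is a list, since an entry can report more than one experimental method.
``summarize_entries`` is a hypothetical helper, not part of ``rcsb-api``.

```python
# Sample input copied from the query result printed above.
result = {
    "data": {
        "entries": [
            {
                "rcsb_id": "4HHB",
                "exptl": [{"method": "X-RAY DIFFRACTION"}],
                "struct": {
                    "title": "THE CRYSTAL STRUCTURE OF HUMAN DEOXYHAEMOGLOBIN "
                             "AT 1.74 ANGSTROMS RESOLUTION"
                },
            }
        ]
    }
}


def summarize_entries(result):
    """Flatten each entry into a dict with its ID, methods, and title."""
    summaries = []
    for entry in result.get("data", {}).get("entries", []):
        # Tolerate fields that are missing or null in the response.
        methods = [item["method"] for item in entry.get("exptl") or []]
        title = (entry.get("struct") or {}).get("title")
        summaries.append({"id": entry["rcsb_id"], "methods": methods, "title": title})
    return summaries


for row in summarize_entries(result):
    print(f"{row['id']}: {', '.join(row['methods'])} | {row['title']}")
```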
Downloading PDB files using BioPython
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In previous sections, we have used BioPython's PDB package to parse PDB files. Now let's see
how we can use BioPython to download PDB files directly from the RCSB PDB database. This can be done using
the ``PDBList`` class from the ``Bio.PDB`` module (see `docs `__).
.. code-block:: console
>>> from Bio.PDB import PDBList
>>>
>>> pdb_list = PDBList()
>>>
>>> pdb_list.retrieve_pdb_file("4HHB", file_format="mmCif", pdir=".")
Downloading PDB structure '4hhb'...
'./4hhb.cif'
>>> exit()
[mbs337-vm]$ ls -l
total 764
-rw-r--r-- 1 ubuntu ubuntu 540 Feb 21 18:39 4HHB_summary.json
-rw-rw-r-- 1 ubuntu ubuntu 764822 Feb 25 02:16 4hhb.cif
drwxrwxr-x 4 ubuntu ubuntu 4096 Feb 19 17:58 docker-exercise
NCBI
----
`NCBI `_ (National Center for Biotechnology Information) is a part of the United States National Library of Medicine,
a branch of the National Institutes of Health. NCBI provides access to a wide range of biological data,
including genomic sequences, protein sequences, and literature. The NCBI APIs allow users to access this data
programmatically, enabling researchers to retrieve information about specific genes, proteins, and other
biological entities.
.. figure:: images/ncbi.png
:width: 600px
:align: center
NCBI main page.
NCBI also provides multiple `APIs `_, including the E-utilities API for accessing all the Entrez databases.
.. figure:: images/ncbi-api.png
:width: 600px
:align: center
NCBI APIs page.
`Entrez `_ is NCBI's search and retrieval system. It searches databases such as PubMed, GenBank, GEO, and
many others. The E-utilities API exposes Entrez programmatically, so scripts can search for and fetch
records about specific genes, proteins, and publications.
Again, we can turn to the BioPython library to interact with the NCBI APIs in a more convenient way.
BioPython provides the ``Entrez`` module for accessing the NCBI APIs (see `docs `__).
Searching, downloading, and parsing GenBank records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For example, let's say we're working with Arabidopsis thaliana (thale cress), a small plant that is a popular
model organism in plant biology, and we want to retrieve the GenBank-format record for the protein encoded at
locus ``AT1G65480``. This is a protein-coding gene on chromosome 1 that promotes flowering. Let's first search
the ``protein`` database for the locus to get a record ID, and then use that ID to fetch the record and parse
it using BioPython.
.. code-block:: console
[mbs337-vm]$ python3
Python 3.12.3 (main, Jan 22 2026, 20:57:42) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import Entrez, SeqIO
>>>
>>> Entrez.email = "A.N.Other@example.com"
>>>
>>> with Entrez.esearch(db="protein", term="AT1G65480") as h:
...     results = Entrez.read(h)
...     type(results)
...     print(results)
...
<class 'Bio.Entrez.Parser.DictionaryElement'>
{'Count': '28', 'RetMax': '20', 'RetStart': '0', 'IdList': ['3178757816', '17432933', '2549168764', '2549167280', '2549167260', '2549163309', '2549152528', '332658914', '1063695107', '15237061', '15218709', '1820247506', '1315962760', '1315962758', '1315962757', '1315946694', '1315946693', '1039007658', '332196260', '508716688'], 'TranslationSet': [], 'TranslationStack': [{'Term': 'AT1G65480[All Fields]', 'Field': 'All Fields', 'Count': '28', 'Explode': 'N'}, 'GROUP'], 'QueryTranslation': 'AT1G65480[All Fields]'}
>>>
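Notice that ``Count`` is 28 but ``IdList`` holds only 20 IDs, because ``RetMax`` limits how many IDs are
returned per request. The sketch below computes the ``retstart``/``retmax`` windows needed to cover every
hit, using the values from the result above. ``paging_windows`` is a hypothetical helper, not part of
BioPython.

```python
def paging_windows(count, retmax):
    """Yield (retstart, retmax) pairs that together cover `count` hits."""
    if retmax <= 0:
        raise ValueError("retmax must be positive")
    for retstart in range(0, count, retmax):
        # The final window may be shorter than retmax.
        yield retstart, min(retmax, count - retstart)


# Values taken from the esearch result above; Entrez returns them as strings.
results = {"Count": "28", "RetMax": "20", "RetStart": "0"}

windows = list(paging_windows(int(results["Count"]), int(results["RetMax"])))
print(windows)  # [(0, 20), (20, 8)]
```

Each pair could then be passed along as
``Entrez.esearch(db="protein", term="AT1G65480", retstart=..., retmax=...)``. For a result set this small,
a single call with a larger ``retmax`` is simpler.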
We'll choose the second ID in the list, ``17432933``, and fetch the corresponding protein record in GenBank format.
.. code-block:: console
>>> gb_rec = None
>>> with Entrez.efetch(db="protein", id="17432933", rettype="gb", retmode="text") as h:
...     record = SeqIO.parse(h, "gb")
...     rec_list = list(record)
...     gb_rec = rec_list[0]
...
>>>
>>> type(gb_rec)
<class 'Bio.SeqRecord.SeqRecord'>
>>>
>>> print(f"ID: {gb_rec.id}\nName: {gb_rec.name}\nDescription: {gb_rec.description}\nSequence: {gb_rec.seq}")
ID: Q9SXZ2.2
Name: FT_ARATH
Description: RecName: Full=Protein FLOWERING LOCUS T
Sequence: MSINIRDPLIVSRVVGDVLDPFNRSITLKVTYGQREVTNGLDLRPSQVQNKPRVEIGGEDLRNFYTLVMVDPDVPSPSNPHLREYLHWLVTDIPATTGTTFGNEIVCYENPSPTAGIHRVVFILFRQLGRQTVYAPGWRQNFNTREFAEIYNLGLPVAAVFYNCQRESGCGGRRL
>>>
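Once the record is parsed, its fields are ordinary Python values that can be reused elsewhere. As a small
illustration, the sketch below formats the ID, description, and sequence printed above as a FASTA entry
using only the standard library. ``to_fasta`` is a hypothetical helper; with BioPython installed,
``SeqIO.write(gb_rec, "ft.fasta", "fasta")`` would write a comparable file directly (the file name here is
just an example).

```python
import textwrap


def to_fasta(record_id, description, sequence, width=60):
    """Format one sequence as a FASTA entry with wrapped sequence lines."""
    header = f">{record_id} {description}".rstrip()
    # The sequence has no spaces, so wrap() splits it into fixed-width chunks.
    body = textwrap.wrap(sequence, width=width)
    return "\n".join([header, *body]) + "\n"


# Values copied from the record printed above.
seq = (
    "MSINIRDPLIVSRVVGDVLDPFNRSITLKVTYGQREVTNGLDLRPSQVQNKPRVEIGGEDLRNFYTLVMVDPDVPSPSNPHLREYLHWLV"
    "TDIPATTGTTFGNEIVCYENPSPTAGIHRVVFILFRQLGRQTVYAPGWRQNFNTREFAEIYNLGLPVAAVFYNCQRESGCGGRRL"
)
fasta = to_fasta("Q9SXZ2.2", "RecName: Full=Protein FLOWERING LOCUS T", seq)
print(fasta, end="")
```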
PubMed and Medline
~~~~~~~~~~~~~~~~~~
To continue with our example, let's say we want to find literature related to the gene ``AT1G65480``.
We can use the PubMed database to search for articles that mention this gene.
.. code-block:: console
[mbs337-vm]$ python3
Python 3.12.3 (main, Jan 22 2026, 20:57:42) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import Entrez, Medline
>>>
>>> Entrez.email = "A.N.Other@example.com"
>>>
>>> idlist = None
>>> with Entrez.esearch(db="pubmed", term="AT1G65480") as h:
...     record = Entrez.read(h)
...     idlist = record["IdList"]
...
>>> idlist
['31219634', '31009078', '26132805', '19825833']
It looks like there are 4 articles that mention the gene ``AT1G65480``. Let's retrieve the details of the
third article in the list, ``26132805``.
.. code-block:: console
>>> art_list = None
>>> with Entrez.efetch(db="pubmed", id="26132805", rettype="medline", retmode="text") as h:
...     records = Medline.parse(h)
...     art_list = list(records)
...
>>>
>>> article = art_list[0]
>>> type(article)
<class 'Bio.Medline.Record'>
>>>
>>> print(f"ID: {article.get('PMID')}\nTitle: {article.get('TI')}\nAuthors: {article.get('AU')}\nSource: {article.get('SO')}\nAbstract: {article.get('AB')}")
ID: 26132805
Title: FT overexpression induces precocious flowering and normal reproductive development in Eucalyptus.
Authors: ['Klocko AL', 'Ma C', 'Robertson S', 'Esfandiari E', 'Nilsson O', 'Strauss SH']
Source: Plant Biotechnol J. 2016 Feb;14(2):808-19. doi: 10.1111/pbi.12431. Epub 2015 Jul 1.
Abstract: Eucalyptus trees are among the most important species for industrial forestry worldwide. However, as with most forest trees, flowering does not begin for one to several years after planting which can limit the rate of conventional and molecular breeding. To speed flowering, we transformed a Eucalyptus grandis x urophylla hybrid (SP7) with a variety of constructs that enable overexpression of FLOWERING LOCUS T (FT). We found that FT expression led to very early flowering, with events showing floral buds within 1-5 months of transplanting to the glasshouse. The most rapid flowering was observed when the cauliflower mosaic virus 35S promoter was used to drive the Arabidopsis thaliana FT gene (AtFT). Early flowering was also observed with AtFT overexpression from a 409S ubiquitin promoter and under heat induction conditions with Populus trichocarpa FT1 (PtFT1) under control of a heat-shock promoter. Early flowering trees grew robustly, but exhibited a highly branched phenotype compared to the strong apical dominance of nonflowering transgenic and control trees. AtFT-induced flowers were morphologically normal and produced viable pollen grains and viable self- and cross-pollinated seeds. Many self-seedlings inherited AtFT and flowered early. FT overexpression-induced flowering in Eucalyptus may be a valuable means for accelerating breeding and genetic studies as the transgene can be easily segregated away in progeny, restoring normal growth and form.
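Because the Medline record behaves like a dictionary keyed by field codes (``PMID``, ``TI``, ``AU``, ``SO``,
``AB``), it is easy to reshape into other forms. The sketch below builds a one-line citation from the
fields printed above, using a plain dictionary as sample input. ``format_citation`` and its layout are
hypothetical choices for illustration, not a standard citation style.

```python
def format_citation(article, max_authors=3):
    """Build a one-line citation from Medline-style fields (AU, TI, SO, PMID)."""
    authors = article.get("AU") or []
    if len(authors) > max_authors:
        author_text = ", ".join(authors[:max_authors]) + ", et al"
    else:
        author_text = ", ".join(authors) or "Unknown authors"
    # Strip trailing periods so joining with ". " does not double them.
    title = article.get("TI", "Untitled").rstrip(".")
    source = article.get("SO", "").rstrip(".")
    pmid = article.get("PMID", "n/a")
    citation = ". ".join(part for part in (author_text, title, source) if part)
    return f"{citation}. PMID: {pmid}"


# Sample input copied from the article fields printed above.
article = {
    "PMID": "26132805",
    "TI": "FT overexpression induces precocious flowering and normal "
          "reproductive development in Eucalyptus.",
    "AU": ["Klocko AL", "Ma C", "Robertson S", "Esfandiari E", "Nilsson O", "Strauss SH"],
    "SO": "Plant Biotechnol J. 2016 Feb;14(2):808-19. doi: 10.1111/pbi.12431. Epub 2015 Jul 1.",
}
print(format_citation(article))
```

Using ``.get`` with defaults keeps the function from failing on records that lack one of the fields, which
can happen for entries without listed authors or abstracts.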
Additional Resources
--------------------
* `iNaturalist API documentation `_
* `pyinaturalist API documentation `_
* `PDB-101: Introduction to RCSB PDB APIs `_
* `RCSB PDB Search API documentation `__
* `RCSB PDB Data API documentation `_
* `rcsb-api documentation `_
* `NCBI APIs documentation `_
* `BioPython documentation `_
* `BioPython Tutorial and Cookbook `_