Cenic.org

From the Ground to the Stars: Critical Big-Data Research in Africa

The sciences of observational astronomy, biology, medicine, geology, and genomics have many things in common, but two of the biggest are their dependence on high-performance networking and high-throughput computing (HTC) and the critical importance of cutting-edge research and researchers based in Africa. Many of the largest and most heavily subscribed telescope arrays in the world are in the southern hemisphere, in places such as Chile, Australia, and South Africa that offer excellent viewing conditions.

Many locations in Africa offer equally superb conditions for biological, geological, and seismic field stations, and African researchers are in the top tier pursuing globally vital innovations in medicine, epidemiology, and genomics.

Astronomy in Africa and its Growing Data Rates

Radio astronomy is a perfect example of the importance of regional and global networking and high-throughput computing. The telescope arrays used in radio astronomy consist of multiple dishes interconnected across large areas. Furthermore, the amount of data they can generate and process is staggering, and the facilities must be made available to a global community of researchers.

Consider the South African Radio Astronomy Observatory (SARAO)’s Square Kilometre Array (SKA). When it becomes fully functional, the SKA will generate up to 960,000 Tb each day, and each radio dish will require a rate of ~160 Gbps to a central processor. The SKA will observe at two different frequencies (with the mid-frequency observation taking place in South Africa), each of which can generate between 0.5 and 1 Tbps of data* – after the raw data are reduced. (The SKA precursor array, MeerKAT, is currently being used by NSF-funded projects for analysis and publications.)
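To put these figures on a common scale, the short Python sketch below converts the daily volume into an average line rate and the reduced data rate into a daily volume. It assumes that "Tb" means terabits, that prefixes are decimal (1 Tb = 10^12 bits), and that the data flow is spread evenly over 24 hours. The function names are illustrative only and are not part of any SKA software.

```python
# Back-of-the-envelope unit conversions for the SKA figures quoted above.
# Assumptions: "Tb" = terabits, decimal prefixes, traffic spread evenly over a day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def terabits_per_day_to_tbps(terabits_per_day: float) -> float:
    """Average rate in terabits per second for a given daily volume in terabits."""
    return terabits_per_day / SECONDS_PER_DAY


def tbps_to_petabytes_per_day(tbps: float) -> float:
    """Daily volume in petabytes for a sustained rate in terabits per second."""
    terabits = tbps * SECONDS_PER_DAY
    terabytes = terabits / 8        # 8 bits per byte
    return terabytes / 1000         # 1,000 TB per PB (decimal)


raw_tbps = terabits_per_day_to_tbps(960_000)
reduced_low_pb = tbps_to_petabytes_per_day(0.5)
reduced_high_pb = tbps_to_petabytes_per_day(1.0)

print(f"Average rate for 960,000 Tb/day: {raw_tbps:.1f} Tbps")
print(f"Reduced data per frequency: {reduced_low_pb:.1f} to {reduced_high_pb:.1f} PB/day")
```

Under those assumptions, 960,000 Tb per day averages out to roughly 11 Tbps, and a reduced stream of 0.5 to 1 Tbps amounts to about 5.4 to 10.8 PB per day for each frequency.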

One example of an NSF-funded experiment with direct benefit to the CENIC community is the Hydrogen Epoch of Reionization Array (HERA, NSF award #1836019), currently in operation in Meerkat National Park in South Africa. HERA’s radio telescope array is dedicated to observing large-scale structure in the early stages of the universe, and it can output 4-8 TB of continuous data each night, generating up to a petabyte of data in a typical season.
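A rough sense of what those nightly volumes mean for networking can be had from the sketch below. It estimates how long one night of HERA data would take to move at a given sustained throughput, and how many nights of observing add up to a petabyte. The link speeds of 1, 10, and 100 Gbps are hypothetical examples rather than HERA's actual connectivity, prefixes are decimal, and protocol overhead is ignored.

```python
# Illustrative transfer-time estimate for HERA's nightly output (4-8 TB per night).
# Assumptions: decimal prefixes, sustained throughput equal to the link rate,
# no protocol overhead; the link speeds below are hypothetical examples.


def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Hours needed to move `terabytes` of data at a sustained `link_gbps`."""
    gigabits = terabytes * 1000 * 8     # TB -> GB -> Gb
    seconds = gigabits / link_gbps
    return seconds / 3600


def nights_to_reach(total_terabytes: float, nightly_terabytes: float) -> float:
    """Number of observing nights needed to accumulate `total_terabytes`."""
    return total_terabytes / nightly_terabytes


for link_gbps in (1, 10, 100):
    low = transfer_hours(4, link_gbps)
    high = transfer_hours(8, link_gbps)
    print(f"{link_gbps:>3} Gbps: {low:.2f} to {high:.2f} hours per night of data")

# One petabyte is 1,000 TB in decimal units.
print(f"Nights to reach 1 PB: {nights_to_reach(1000, 8):.0f} to {nights_to_reach(1000, 4):.0f}")
```

On these assumptions, an 8 TB night takes close to 18 hours to move at a sustained 1 Gbps but under two hours at 10 Gbps, and a petabyte corresponds to between 125 and 250 nights of observing.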

HERA also offers mentorship and research programs aimed at students in California and Arizona: the California-Arizona Minority Partnership for Astronomy Research and Education (CAMPARE) program and the CAMPARE-HERA Astronomy Minority Partnership (CHAMP).

Globally Networked Biological Field Stations in Africa

Like telescopes, which must be placed where viewing conditions are best, biological field stations must be placed in specific and often sensitive locations, yet they and the data they generate must be available to researchers the world over.

The International Organization of Biological Field Stations (IOBFS) includes field stations, marine labs, and research centers all around the world, with several critical ones located in vastly diverse environments in Africa: the Nigerian Montane Forest Project, the Bouamir Research Camp in Cameroon’s Congo Basin Rainforest, the Makerere University Biological Field Station in Uganda, the Senckenberg Biodiversity and Climate Research Centre's Nkweseko Scientific Station in Tanzania, and stations located in Kruger National Park in South Africa.

Internationally Impactful Medical and Genomic Research

As the COVID-19 pandemic illustrated, effective response to infectious diseases requires global collaboration and global networking. The National Institutes of Health (NIH) sponsor and collaborate on medical projects around the world, many of which need high-performance networking thanks to the explosive growth of genomics and high-resolution medical imaging and video.

One such collaborative initiative between the NIH and African medical research communities is the United States Army Medical Research Directorate Kenya (USAMRD-K). The focus of this Nairobi-based program is the development of drugs and vaccines for malaria and other tropical diseases.

Yet another example of big-data medical research in Africa is the genomics and bioinformatics project Human Heredity and Health in Africa (H3Africa), which includes 48 projects on population-based genomic studies of common, non-communicable disorders like heart and renal disease as well as communicable diseases like tuberculosis. Such research clearly requires regional, international, and global high-performance networking to be effective and timely.

Quakes, Tsunamis, and Nuclear Monitoring: Geoscience in Africa

Earth sciences – geodesy, seismology, and geology – by definition require global coverage and real-time availability of big data worldwide. Two examples of sensor networks that provide coverage and data – and that require networking and cyberinfrastructure to do so – are the Global Seismographic Network (GSN) and the AfricaArray Seismic Network.

The GSN is a cooperative scientific facility operated jointly by the NSF and the US Geological Survey (USGS). The network consists of 152 stations around the world, including many across Africa, and provides free, real-time, open-access data for research as well as for tsunami warning response and monitoring for the Comprehensive Nuclear-Test-Ban Treaty Organization.

The AfricaArray Seismic Network was developed as a partnership between the University of the Witwatersrand in South Africa and the Pennsylvania State University. Its 57 stations are located across the continent, from Ghana, Nigeria, and Cameroon in West Africa to many other nations between Ethiopia and the southernmost tip of South Africa, as well as Cape Verde in the Atlantic Ocean and Madagascar in the Indian Ocean.

AfricaArray is more than a network, however. Academic and research programs are also among its prominent contributions to global geoscience along with its commitment to natural resource management.

AmLight: A Big-Data Pathway for Africa-US Collaboration

And of course, all of the above big-data projects require integration into the global high-performance network and cyberinfrastructure environments.

In particular, several of the projects listed above would benefit from the use of the AmLight-ExP network (NSF Award #2029283) for access to CI resources in the US. Via its connection to the Brazilian research and education network RNP, it connects to Africa in Angola and from there to the West Africa Cable System (WACS) and on to Cape Town, South Africa. Significant big-data research projects, many involving CENIC member institutions, use AmLight to enable global collaboration.

AmLight plays a significant role in facilitating US-Africa research partnerships by matching up researchers with instruments, data, and each other. Researchers at CENIC member institutions who are interested in taking advantage of this pathway to collaborators in Africa are encouraged to contact CENIC or Vasilka Chergarova, IT Assistant Director at Florida International University, at vchergar@fiu.edu to learn more.

CENIC gratefully acknowledges the help of Vasilka Chergarova (Florida International University Center for Internet Augmented Research & Assessment), Nell Cox (Emory University), and Heidi Morgan (University of Southern California Information Sciences Institute Networking & Cybersecurity) in the preparation of this article.

* A previous version of this article erroneously stated that the data rate would be 7 Tbps after processing.
