Piotr Kica, Michał Orzechowski, Maciej Malawski

Aligning RNA sequences to a reference genome is essential for analyzing gene expression, detecting genetic variations, and conducting transcriptomic research. Among the most efficient and accurate tools for this purpose is the STAR aligner, a computationally demanding yet widely adopted solution in modern transcriptomics.
Serverless computing models such as Function-as-a-Service (FaaS) and Container-as-a-Service (CaaS) have emerged as flexible frameworks for executing diverse workloads, including batch and event-driven tasks. These architectures offer key advantages over conventional infrastructure: high scalability, minimal maintenance, and cost efficiency through pay-per-use billing. Nonetheless, each service type comes with its own constraints, which may affect its suitability for running STAR-based RNA sequence processing.
Previous studies have demonstrated the feasibility of executing lightweight aligners, such as HiSat2, within serverless environments. By partitioning input FASTQ files into smaller chunks, researchers effectively parallelized computations, achieving notable performance improvements. Despite this progress, STAR remains superior in both alignment accuracy and potential processing speed.
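The chunking idea can be illustrated with a minimal Python sketch. It is not the method used in the cited studies; the function name fastq_chunks and the parameter reads_per_chunk are hypothetical, and the sketch assumes the common four-line FASTQ layout (header, sequence, separator, quality) with no multi-line sequences.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def fastq_chunks(lines: Iterable[str], reads_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTQ lines, each holding at most reads_per_chunk reads.

    Assumption: every read occupies exactly four lines.
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be positive")
    it = iter(lines)
    while True:
        # Take four lines per read so that no record is split across chunks.
        chunk = list(islice(it, 4 * reads_per_chunk))
        if not chunk:
            return
        if len(chunk) % 4 != 0:
            raise ValueError("truncated FASTQ record")
        yield chunk


if __name__ == "__main__":
    # Five synthetic reads, split into chunks of at most two reads each.
    sample = []
    for i in range(5):
        sample += [f"@read{i}\n", "ACGT\n", "+\n", "IIII\n"]
    for n, chunk in enumerate(fastq_chunks(sample, reads_per_chunk=2)):
        print(f"chunk {n}: {len(chunk) // 4} reads")
```

Each chunk could then be handed to a separate function or container invocation. Because the generator reads lazily, it also works on an open file handle without loading the whole FASTQ file into memory.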
Deploying STAR efficiently in a serverless setup poses several challenges, primarily due to its reliance on a large prebuilt genome index that must reside in memory during execution. For the human genome, this index typically occupies around 30 GB, which exceeds the memory limits imposed by many cloud services.
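The memory constraint comes down to a simple feasibility check, sketched below in Python. The tier names and memory limits are hypothetical placeholders, not figures for any real cloud service, and the 2 GB headroom for the aligner's working memory is likewise an assumption. Only the 30 GB index size comes from the text above.

```python
def fits_in_memory(index_gb: float, memory_limit_gb: float, headroom_gb: float = 2.0) -> bool:
    """Return True if the index plus working headroom fits within the limit.

    headroom_gb is an assumed allowance for memory used beyond the index.
    """
    return index_gb + headroom_gb <= memory_limit_gb


if __name__ == "__main__":
    HUMAN_INDEX_GB = 30.0  # approximate size stated in the text
    # Hypothetical service tiers; replace with the limits of the platform under test.
    hypothetical_limits_gb = {"tier_small": 10.0, "tier_medium": 32.0, "tier_large": 64.0}
    for name, limit in hypothetical_limits_gb.items():
        verdict = "fits" if fits_in_memory(HUMAN_INDEX_GB, limit) else "does not fit"
        print(f"{name} ({limit:.0f} GB): index {verdict}")
```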


The objectives of this study include:
• Identifying serverless platforms capable of supporting STAR alignment tasks,
• Conducting performance tests of the STAR aligner in a serverless environment,
• Comparing cost-efficiency between serverless and conventional computing models,
• Exploring optimization strategies for resource management,
• Proposing practical scenarios for deploying STAR in serverless settings.
