Alessandro Morari, Ph.D.
AI Performance at Scale @NVIDIA
Technical Leader and Researcher with over 16 years making compute faster at massive scale - from the world's fastest supercomputers to today's frontier AI systems.
Currently at NVIDIA optimizing GPU kernels for top AI labs and hyperscalers. Previously contributed to IBM Summit and Sierra supercomputers (world's #1 and #2 in 2018), built high-performance distributed systems at Pacific Northwest National Laboratory, and created NYU's first graduate course on High Performance Machine Learning.
Focus areas: GPU kernel optimization, AI inference performance, large-scale system software, HPC + AI intersection.
PhD in Computer Architecture. 30+ publications. 15 patents.
Ph.D. in Computer Architecture, Polytechnic University of Catalunya.
M.Sc. in Computer Engineering, University of Rome Tor Vergata.
B.Sc. in Computer Engineering, Roma Tre University.
Outstanding Division Achievement Award, IBM (2018) — CORAL Summit and Sierra Supercomputers
Outstanding Performance Award, PNNL (2014) — Graph Engine for Multithreaded Systems (GEMS)
IPDPS Best Paper Award (2012)
IBM Top Technical Talent Award (2017)
Exemplar People Manager Award, IBM
"NVIDIA Introduces CUDA 13.1 with CUDA Tile". Calling it the largest advancement since the NVIDIA CUDA platform was introduced in 2006, NVIDIA has launched CUDA 13.1 with CUDA Tile https://insidehpc.com/2025/12/nvidia-introduces-cuda-13-1-with-cuda-tile/
"Two DOE Supercomputers Top List of World's Fastest" - U.S. Department of Energy Oak Ridge's Summit and Lawrence Livermore's Sierra ranked #1 and #2 globally https://www.energy.gov/articles/two-doe-supercomputers-top-list-worlds-fastest
"Sierra honored with Top Supercomputing Achievement" - HPCwire Editors' Choice Award Lawrence Livermore National Laboratory https://www.llnl.gov/article/44896/sierra-honored-top-supercomputing-achievement-hpcwire
"Summit Supercomputer is Already Making its Mark on Science" - HPCwire Coverage of Summit's scientific impact and architecture https://www.hpcwire.com/2018/09/20/summit-supercomputer-is-already-making-its-mark-on-science/
"Startup Trovares Brings HPC to Graph Analytics" - HPCwire Seattle startup commercializes PNNL graph database technology https://www.hpcwire.com/2019/04/16/startup-trovares-brings-hpc-to-graph-analytics/
"A Meaningful Data Miner" - PNNL Science Highlights GEMS framework enables graph queries on billion-triple datasets https://www.pnnl.gov/science/highlights/highlight.asp?id=3949
"PNNL: A Collection of Quality HPC Research at IEEE Cluster 2015" - PNNL High-performance dictionary encoding for RDF datasets https://www.pnnl.gov/science/highlights/highlight.asp?id=4018
IEEE Micro, 2014
Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, Antonino Tumeo, Jesse Weaver, David Haglin, Sutanay Choudhury, John Feo
A software stack that relies primarily on graph-based methods to implement scalable RDF databases on commodity clusters.
IEEE Computer, 2015
Vito Giovanni Castellana, Alessandro Morari, Jesse Weaver, Antonino Tumeo, David Haglin, Oreste Villa, John Feo
High-performance in-memory graph database system for web-scale semantic data.
IPDPS 2011
Alessandro Morari, Roberto Gioiosa, Robert W. Wisniewski, Francisco J. Cazorla, Mateo Valero
Developed a technique for quantitatively analyzing the OS events that limit application scalability on large-scale supercomputers.
IPDPS 2012
Alessandro Morari, Roberto Gioiosa, Robert W. Wisniewski, Bryan S. Rosenburg, Todd Inglett, Mateo Valero
Quantitative analysis of TLB misses on Blue Gene/P at scale (up to 4096 cores).
IPDPS 2014
Alessandro Morari, Antonino Tumeo, Daniel G. Chavarría-Miranda, Oreste Villa, Mateo Valero
Techniques for scaling irregular applications on distributed memory systems.
ICSE-SEIP 2021
Yunhui Zheng, Saurabh Pujar, Burn Lewis, Luca Buratti, Edward Epstein, Bo Yang, Jim Laredo, Alessandro Morari, Zhong Su
Production ML dataset for code vulnerability detection - shows ML deployment capability.