I am a software engineer at Isovalent, working on cloud-native networking, security, and observability using eBPF.
Short Bio
I graduated from the School of Electrical and Computer Engineering at the National Technical University of Athens in 2004, where I also did my PhD. During my PhD I mostly worked on performance optimization for scientific applications. My main focus was sparse matrices and the sparse matrix-vector multiplication kernel (SpMV). I developed storage formats for compressing sparse matrices, including CSX, which was further developed and has since evolved into a sparse kernel optimization library called SparseX.
After my PhD, I worked as a post-doctoral researcher at the Systems Group of ETH Zurich with Prof. Timothy Roscoe. There, together with the rest of the Barrelfish team, I explored many aspects of modern operating system design, aiming to address the challenges of current and future multicore architectures (scalability, heterogeneity, and hardware complexity). I also worked on Dragonet, a network stack that treats the complexities of modern NICs as a primary concern.
I was a visiting scientist at IBM Research, Zurich. Among other things, I worked on: SALSA, a host translation layer for storage devices such as Flash SSDs and SMR disk drives; uDepot, a key-value store targeting fast SSDs; and cmnnc, a compiler that maps neural network inference computations to a computational memory accelerator.
Selected publications
(for a full list, see my Google Scholar page)
- Toward a Better Understanding and Evaluation of Tree Structures on Flash SSDs. D. Didona, N. Ioannou, R. Stoica, K. Kourtis, VLDB 2021 (pdf).
- Compiling Neural Networks for a Computational Memory Accelerator. K. Kourtis, M. Dazzi, N. Ioannou, T. Grosser, A. Sebastian, E. Eleftheriou, SPMA 2020 (paper on arxiv, slides, blog post, source).
- Reaping the performance of fast NVM storage with uDepot. K. Kourtis, N. Ioannou, I. Koltsidas, FAST 2019 (FAST page, paper, source).
- Parallel training of linear models without compromising convergence. N. Ioannou, C. Dünner, K. Kourtis, T. Parnell, MLSys/NeurIPS 2018 (arXiv).
- Elastic CoCoA: Scaling In to Improve Convergence. M. Kaufmann, T. Parnell, K. Kourtis, MLSys/NeurIPS 2018 (arXiv).
- Mira: Sharing Resources for Distributed Analytics at Small Timescales. M. Kaufmann, K. Kourtis, A. Schuepbach, M. Zitterbart, IEEE BigData 2018 (paper).
- Elevating Commodity Storage with the SALSA Host Translation Layer. N. Ioannou, K. Kourtis, I. Koltsidas, MASCOTS 2018 (paper, presentation).
- The HCl Scheduler: Going All-in on Heterogeneity. M. Kaufmann, K. Kourtis, HotCloud 2017 (paper, HotCloud page).
- FlashNet: Flash/Network Stack Co-Design. A. Trivedi, N. Ioannou, B. Metzler, P. Stuedi, J. Pfefferle, I. Koltsidas, K. Kourtis, T. R. Gross, SYSTOR 2017 (ACM page).
- Intelligent NIC Queue Management in the Dragonet Network Stack. K. Kourtis, P. Shinde, A. Kaufmann, T. Roscoe, TRIOS 2015 (paper, slides, source).
- Not your parents' physical address space. S. Gerber, G. Zellweger, R. Achermann, K. Kourtis, T. Roscoe, D. Milojicic, HotOS 2015 (paper, HotOS page).
- Decoupling Cores, Kernels, and Operating Systems. G. Zellweger, S. Gerber, K. Kourtis, T. Roscoe, OSDI 2014 (paper, OSDI page).
- Cosh: Clear OS Data Sharing In An Incoherent World. A. Baumann, C. Hawblitzel, K. Kourtis, T. Harris, T. Roscoe, TRIOS 2014 (paper, TRIOS page).
- Modeling NICs with Unicorn. P. Shinde, A. Kaufmann, K. Kourtis, T. Roscoe, PLOS/SOSP 2013 (paper, presentation).
- Sequences for parallel programming. K. Kourtis, Technical Report (report).
- Towards a compiler/runtime synergy to predict the scalability of parallel loops. G. Chatzopoulos, K. Kourtis, N. Koziris, G. Goumas, MuCoCoS/PACT 2013 (paper).
- Improving the performance of the symmetric sparse matrix-vector multiplication in multicore. T. Gkountouvas, V. Karakasis, K. Kourtis, G. Goumas, and N. Koziris, IPDPS 2013 (paper).
- An extended compression format for the optimization of sparse matrix-vector multiplication. V. Karakasis, T. Gkountouvas, K. Kourtis, G. Goumas, and N. Koziris, TPDS 2013 (paper).
- CSX: An extended compression format for the optimization of sparse matrix-vector multiplication. K. Kourtis, V. Karakasis, G. Goumas and N. Koziris, PPoPP 2011 (paper, presentation, source, errata).
- Data compression techniques for performance improvement of memory-intensive applications on shared memory architectures. K. Kourtis, PhD thesis (thesis).
- Exploiting Compression Opportunities to Improve SpMxV Performance on Shared Memory Systems. K. Kourtis, G. Goumas and N. Koziris, ACM TACO 2010 (paper).
- Improving the Performance of Multithreaded Sparse Matrix-Vector Multiplication using Index and Value Compression. K. Kourtis, G. Goumas and N. Koziris, ICPP 2008 (paper, presentation).
- Optimizing Sparse Matrix-Vector Multiplication Using Index and Value Compression. K. Kourtis, G. Goumas and N. Koziris, CF 2008 (paper, presentation).
- Runtime Code Generation for Huffman Decoders. K. Kourtis, Technical Report (report).
- Performance evaluation of the sparse matrix-vector multiplication on modern architectures. G. Goumas, K. Kourtis, N. Anastopoulos, V. Karakasis and N. Koziris, SCJ 2009 (paper).
- Global-scale peer-to-peer file services with DFS. A. Chazapis, G. Tsoukalas, G. Verigakis, K. Kourtis, A. Sotiropoulos and N. Koziris, GRID 2007 (paper).
- Exploring the Performance Limits of Simultaneous Multithreading for Memory Intensive Applications. E. Athanasaki, N. Anastopoulos, K. Kourtis and N. Koziris, SCJ 2008 (paper).
Teaching (at ETH)
- Spring 2014: Parallel Programming
- Fall 2013: Supporting Parallelism in Operating Systems and Programming Languages
- Fall 2012: Supporting Parallelism in Operating Systems and Programming Languages
- Spring 2012: Seminar: Parallel Language Systems