I'm José Wesley

Researcher in compilers and programming languages

I'm a Ph.D. Candidate

Name: José Wesley de Souza Magalhães
Job: Research
Birth: 02/1996
Residence: Edinburgh, Scotland, UK
Hometown: Ferros, Minas Gerais, Brazil

I'm a Ph.D. Candidate in Computer Science at The University of Edinburgh (UoE). Currently I'm working as a Research Postgraduate in the Institute for Computing and Systems Architecture (ICSA) under the supervision of Professor Michael O'Boyle. My research interests include Compilers and Programming Languages, focusing on Neural-Guided Program Synthesis and Benchmark Validation.

I am currently researching methods to automatically lift legacy code to new, modern, and optimized languages. The goal is to translate programs, or parts of them, to domain-specific languages. Lifting makes it possible to leverage domain knowledge and execute code on new hardware platforms, leading to performance gains.


My Education

Ph.D. in Computer Science
The University of Edinburgh 2022-NOW
Advisor: Michael O'Boyle

Location: Edinburgh, Scotland, UK
Master's Degree in Computer Science
Federal University of Minas Gerais (UFMG) 2019-2021
Final Article: Automatic Inspection of Program State in an Uncooperative Environment

Advisor: Fernando Magno Quintão Pereira

Location: Belo Horizonte, Minas Gerais, Brazil
Bachelor's Degree in Computer Science
Federal University of Viçosa (UFV) 2015-2018
Final Project: A Flow-Sensitive Approach for Steensgaard's Pointer Analysis

Advisor: Daniel Mendes Barbosa

Location: Florestal, Minas Gerais, Brazil

My Projects

Lifting legacy code to optimized domain-specific languages

Project Lifting

Existing software is usually written in general-purpose languages, which take little advantage of the characteristics of the application's domain. Moreover, rewriting code for a new target is difficult, tedious, and error-prone. Program lifting is a technique to automatically translate legacy code to modern domain-specific languages (DSLs). Such DSLs embed domain knowledge, which is crucial for optimizing programs and leads to large performance improvements. In fact, DSLs have been outperforming general-purpose languages in a variety of domains.

In addition, lifting helps keep pace with the rapid evolution of computer architecture by automatically porting existing codebases to new hardware platforms. Emerging hardware is often programmed through domain-specific languages. Lifting can therefore bridge the gap between legacy code and modern hardware devices by mapping programs, or parts of them, to new accelerator-oriented DSLs.
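
As a rough illustration of the idea, and not the method used in this project, lifting can be pictured as a search for a DSL expression that behaves like the legacy code. The sketch below assumes a toy tensor-style DSL with three candidate expressions and accepts the first candidate that agrees with the legacy loop on random inputs. The names `legacy_kernel`, `CANDIDATES`, and `lift` are hypothetical, and agreement on tests is only evidence of equivalence, not a proof.

```python
import random


def legacy_kernel(a, b):
    """Legacy code: an explicit loop, standing in for a C function."""
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * b[i]
    return out


# Hypothetical toy DSL: each candidate pairs a tensor-style expression
# with a small interpreter for it.
CANDIDATES = {
    "out(i) = a(i) + b(i)": lambda a, b: [x + y for x, y in zip(a, b)],
    "out(i) = a(i) - b(i)": lambda a, b: [x - y for x, y in zip(a, b)],
    "out(i) = a(i) * b(i)": lambda a, b: [x * y for x, y in zip(a, b)],
}


def lift(legacy, candidates, trials=20, seed=0):
    """Return the first candidate matching the legacy code on random inputs."""
    rng = random.Random(seed)
    tests = []
    for _ in range(trials):
        n = rng.randint(1, 8)
        a = [rng.randint(-9, 9) for _ in range(n)]
        b = [rng.randint(-9, 9) for _ in range(n)]
        tests.append((a, b))
    for expr, run in candidates.items():
        if all(run(a, b) == legacy(a, b) for a, b in tests):
            return expr
    return None  # no candidate in the toy DSL matches


print(lift(legacy_kernel, CANDIDATES))
```

A real lifting tool would search a far larger space of expressions and would need a stronger equivalence check than random testing.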
Whiro: Automatic inspection of program state

Project Whiro

This project aims to automatically insert verification code into programs. This code reports the internal state of a program at a given program point, that is, the values of the program's variables at that point. The technique tracks the values of static, stack-allocated, and heap-allocated variables, can traverse the memory graph in unsafe languages such as C and C++, and works on programs transformed by different compiler optimizations. It can be used for debugging, benchmark synthesis, and data visualization.
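
To make the notion of "program state" concrete, here is a minimal sketch in Python rather than C, and not Whiro's implementation. It assumes the variables visible at a program point are passed in as a dictionary, and it follows references through the object graph while marking objects it has already visited so that cycles terminate. The names `inspect_state` and `Node` are hypothetical.

```python
class Node:
    """A heap-allocated linked-list cell used for the demonstration."""

    def __init__(self, value):
        self.value = value
        self.next = None


def inspect_state(variables):
    """Report each variable's value, following references through the heap."""
    seen = {}  # object id -> tag assigned on first visit

    def visit(value):
        if value is None or isinstance(value, (bool, int, float, str)):
            return repr(value)
        if id(value) in seen:
            return f"<ref #{seen[id(value)]}>"  # already reported: stop here
        tag = len(seen)
        seen[id(value)] = tag
        if isinstance(value, (list, tuple)):
            return f"#{tag}[" + ", ".join(visit(v) for v in value) + "]"
        fields = getattr(value, "__dict__", None)
        if fields is None:
            return repr(value)  # opaque object: fall back to its repr
        body = ", ".join(f"{k}={visit(v)}" for k, v in fields.items())
        return f"#{tag}{type(value).__name__}({body})"

    return {name: visit(v) for name, v in variables.items()}


# A two-element cyclic list: the traversal must not loop forever.
head = Node(1)
head.next = Node(2)
head.next.next = head
report = inspect_state({"head": head, "count": 2})
print(report["head"])
```

In C or C++ the same traversal has to be driven by type information, since raw memory does not say where pointers are or how large an allocation is.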
Jotai: Generating synthetic inputs to turn Angha programs into executable

Project Jotai Benchmarks

The goal of this project is to create a technique that synthesizes inputs for programs so that they can be executed. We implement a Clang plugin that analyzes function signatures and gathers information about their parameters. Our framework then instantiates constraint-guided inputs based on the parameter types of a function and links them into the program as a library. Using this framework, we will be able to transform the programs created in Project Angha into executable benchmarks, enabling many different experiments that involve, for instance, running time and resource usage.
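
The signature-driven approach can be sketched in a few lines of Python. This is an analogy under stated assumptions, not the Clang plugin itself: Python's `inspect.signature` plays the role of the signature analysis, and a hypothetical `GENERATORS` table maps each parameter type to a generator that respects simple constraints (non-negative integers, non-empty lists). The names `synthesize_inputs` and `clamp_sum` are also hypothetical.

```python
import inspect
import random

# Hypothetical type-to-generator table; the constraints are illustrative.
GENERATORS = {
    int: lambda rng: rng.randint(0, 100),
    float: lambda rng: rng.uniform(0.0, 1.0),
    list: lambda rng: [rng.randint(0, 100) for _ in range(rng.randint(1, 5))],
}


def synthesize_inputs(func, seed=0):
    """Build one argument per parameter from the function's annotations."""
    rng = random.Random(seed)
    args = []
    for param in inspect.signature(func).parameters.values():
        generator = GENERATORS.get(param.annotation)
        if generator is None:
            raise TypeError(f"no generator for parameter {param.name!r}")
        args.append(generator(rng))
    return args


def clamp_sum(values: list, limit: int) -> int:
    """Function under test: it cannot run until it is given inputs."""
    return min(sum(values), limit)


args = synthesize_inputs(clamp_sum)
result = clamp_sum(*args)
print(args, result)
```

Once inputs exist, the function can be timed or profiled like any other benchmark.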
Angha: A suite with over one million compilable programs mined from open-source repositories

Project Angha

The goal of this project is to design, implement, and validate a general methodology to produce benchmarks for compilers. This approach goes over open-source repositories containing C programs, downloads these programs, splits them into individual functions, and uses Psyche-c, a type reconstructor, to produce compilable bytecodes out of each individual function. To demonstrate that this methodology is useful, we used the benchmarks to improve a machine learning technique and trained a predictive compiler for code size reduction. We have been able to generate binaries over 10% smaller than clang -Oz.
Miscellaneous
Projects
- Back in 2017, I wrote a book about the Julia language with some college colleagues. It is in Portuguese, but it may be useful for beginners in the language. View Book
- I served on the Artifact Evaluation Committee for CGO 2021. View Committee
- I have taught some classes about LLVM metadata in the LLVM Introductory Course promoted by the Compilers Laboratory on YouTube. Check it out!

My Papers

2025

Guided Tensor Lifting
Yixuan Li, José Wesley de Souza Magalhães, Alexander Brauckmann, Michael F.P. O'Boyle, Elizabeth Polgreen
Conference: Programming Language Design and Implementation (PLDI) 2025

2025

Distinguished Paper

Tensorize: Fast Synthesis of Tensor Programs from Legacy Code using Symbolic Tracing, Sketching and Solving
Alexander Brauckmann, Luc Jaulmes, José Wesley de Souza Magalhães, Elizabeth Polgreen, Michael F.P. O'Boyle
Conference: International Symposium on Code Generation and Optimization (CGO) 2025

2023

Best Research Paper

C2TACO: Lifting Tensor Code to TACO
José Wesley de Souza Magalhães, Jackson Woodruff, Elizabeth Polgreen, Michael F. P. O'Boyle
Conference: International Conference on Generative Programming: Concepts & Experiences (GPCE) 2023

2022

Automatic Inspection of Program State in an Uncooperative Environment
José Wesley de Souza Magalhães, Chunhua Liao, and Fernando Magno Quintão Pereira
Journal: Software: Practice and Experience. 2022; 1-32

ExeBench: An ML-Scale Dataset of Executable C Functions
Jordi Armengol-Estapé, Jackson Woodruff, Alexander Brauckmann, José Wesley de Souza Magalhães, Michael F. P. O'Boyle.
Conference: International Symposium on Machine Programming (MAPS) 2022

2021

AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction
Anderson Faustino da Silva, Bruno Conde Kind, José Wesley de Souza Magalhães, Jerônimo Nunes Rocha, Breno Campos Ferreira Guimarães, and Fernando Magno Quintão Pereira.
Conference: International Symposium on Code Generation and Optimization (CGO) 2021

2019

2nd Best Research Paper

Synthesis of Benchmarks for the C Programming Language by Mining Software Repositories
Breno Campos Ferreira Guimarães, José Wesley Magalhães, Fernando Quintão Pereira, and Anderson Faustino da Silva.
Conference: XXIII Brazilian Symposium on Programming Languages (SBLP)


Contact Me

lattes/5322689552829310

scholar/JoseWesley

[email protected]
[email protected]

github.com/JWesleySM

linkedin/josé-wesley

t.me/jwesleysm

Edinburgh
Scotland, UK
