Creating Tailored Programming Languages for Efficient Visual AI Systems | MIT News

A single photo offers insight into the creator’s world – their interests and feelings about a topic or space. But what about the creators behind the technologies that help make these images possible?

Jonathan Ragan-Kelley, an associate professor in the Department of Electrical Engineering and Computer Science at MIT, is one of these people, having developed everything from visual effects tools for films to the Halide programming language, which is widely used in industry for photo editing and processing. A researcher at the MIT-IBM Watson AI Lab and the Computer Science and Artificial Intelligence Laboratory, Ragan-Kelley specializes in high-performance, domain-specific programming languages and machine learning that enable 2D and 3D graphics, visual effects, and computational photography.

“The biggest focus of our research is developing new programming languages that make it easier to write programs that run really efficiently on the increasingly complex hardware that’s in your computer today,” says Ragan-Kelley. “If we want to continue to increase the computing power we can actually use for real-world applications – from graphics and visual computing to AI – we need to change the way we program.”

Finding a middle ground

Over the past two decades, chip designers and programming engineers have seen Moore’s law slow down and computing shift significantly from general-purpose CPUs to more diverse and specialized processing units such as GPUs and accelerators. This transition comes with a trade-off: giving up the ability to run general-purpose code, somewhat slowly, on CPUs in exchange for faster, more efficient hardware that requires code to be heavily adapted to it and mapped onto it with tailored programs and compilers. Newer hardware with improved programming can better support applications such as high-bandwidth cellular radio interfaces, decoding highly compressed video for streaming, and graphics and video processing on power-constrained cell phone cameras, to name just a few.

“Our work is all about unlocking the power of the best hardware we can build to deliver as much computing power and efficiency as possible for these types of applications, in ways that traditional programming languages cannot.”

To achieve this, Ragan-Kelley divides his work into two strands. In the first, he sacrifices generality to capture the structure of specific, important computational problems and exploits that structure for better computational efficiency. This is evident in the image processing language Halide, which he helped develop and which has helped transform the image editing industry in programs such as Photoshop. Because it was developed specifically for fast processing of dense, regular arrays of numbers (tensors), it is also well suited to neural network computations. The second strand focuses on automation, particularly how compilers map programs to hardware. One such project with the MIT-IBM Watson AI Lab uses Exo, a language developed in Ragan-Kelley’s group.

Over the years, researchers have worked persistently to automate coding with compilers, which can be a black box; however, there is still a great need for explicit control and tuning by performance engineers. Ragan-Kelley and his group are developing methods that combine both approaches and balance the trade-offs to achieve effective, resource-efficient programming. At the heart of many high-performance programs, such as video game engines or cell phone camera processing, are state-of-the-art systems that are largely hand-optimized by human experts in low-level, detailed languages such as C, C++, and assembly. This is where engineers make specific decisions about how the program runs on the hardware.

Ragan-Kelley points out that programmers can choose “very laborious, very unproductive, and very insecure low-level code” that could lead to bugs, or “safer, more productive, higher-level programming interfaces” that lack the ability to make fine adjustments in the compiler to how the program is executed, and that usually deliver lower performance. So his team is trying to find a middle ground. “We’re trying to figure out how to provide control over the key issues that human performance engineers want to be able to control,” says Ragan-Kelley. “So we’re trying to build a new class of languages that we call user-schedulable languages that provide safer and higher-level handles to control what the compiler does or how the program is optimized.”
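The idea of a “schedule” can be made concrete with a small sketch. The Python below is purely illustrative and is not Halide’s or Exo’s actual interface; every name in it (brighten, row_major, tiled, run) is hypothetical. It assumes only the general principle described above: what to compute is written once, and how the computation is ordered is chosen separately without changing the result.

```python
from typing import Callable, Iterator, List, Tuple

# A schedule is modeled here as a function that yields pixel
# coordinates in some order. This is an assumption made for the
# sketch, not how any real user-schedulable language defines it.
Schedule = Callable[[int, int], Iterator[Tuple[int, int]]]


def brighten(pixel: int) -> int:
    """The algorithm: WHAT to compute for a single pixel."""
    return min(255, pixel + 10)


def row_major(height: int, width: int) -> Iterator[Tuple[int, int]]:
    """A schedule: visit pixels row by row."""
    for y in range(height):
        for x in range(width):
            yield y, x


def tiled(tile: int) -> Schedule:
    """A second schedule: visit pixels in tile-by-tile blocks."""
    def order(height: int, width: int) -> Iterator[Tuple[int, int]]:
        for ty in range(0, height, tile):
            for tx in range(0, width, tile):
                for y in range(ty, min(ty + tile, height)):
                    for x in range(tx, min(tx + tile, width)):
                        yield y, x
    return order


def run(image: List[List[int]], schedule: Schedule) -> List[List[int]]:
    """Apply the same algorithm in whatever order the schedule dictates."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y, x in schedule(height, width):
        out[y][x] = brighten(image[y][x])
    return out


# A small synthetic 4x5 image used only for this demonstration.
image = [[(7 * y + 3 * x) % 256 for x in range(5)] for y in range(4)]

# Changing the schedule changes the traversal order, not the answer.
assert run(image, row_major) == run(image, tiled(2))
```

In a real system the choice of schedule affects things like cache behavior and parallelism on the target hardware; this toy version only shows that the order of work can be swapped while the algorithm itself stays untouched.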

Unlocking hardware: high-level and underserved pathways

Ragan-Kelley and his research group are addressing this problem through two lines of work. One applies machine learning and modern AI techniques to automatically generate optimized schedules, which serve as an interface to the compiler, in order to achieve better compiler performance. The other uses “exocompilation,” which he is working on with the lab. He describes this method as a way to “turn the compiler on its head,” with a skeleton of a compiler that exposes controls for human guidance and customization. Additionally, his team can add its own custom schedulers, which can help target specialized hardware such as machine learning accelerators from IBM Research. Applications for this work run the gamut: computer vision, object recognition, speech synthesis, image synthesis, speech recognition, text generation (large language models), and more.

A broader project of his with the lab goes a step further and approaches the work through a systems lens. Led by his advisee and lab intern William Brandon, and in collaboration with lab researcher Rameswar Panda, Ragan-Kelley’s team is rethinking large language models (LLMs), looking for ways to slightly change the model’s computation and programming architecture so that transformer-based models can run more efficiently on AI hardware without sacrificing accuracy. According to Ragan-Kelley, their work departs from standard thinking in significant ways, with potentially large payoffs: lower costs, improved capabilities, and/or shrinking the LLM so that it requires less memory and can run on smaller computers.

It’s this more avant-garde thinking about computing efficiency and hardware in which Ragan-Kelley excels and sees value, especially in the long term. “I think there are areas [of research] that need to be pursued but are well established or obvious or so conventional that many people either already pursue them or will pursue them,” he says. “We try to find ideas that can have a big practical impact on the world, and at the same time are things that wouldn’t necessarily happen, or that I think are undervalued relative to their potential by the rest of the community.”

The course he now teaches, 6.106 (Software Performance Engineering), is an example of this. About 15 years ago, devices shifted from single processors to multiple processors, which led many academic programs to start teaching parallelism. But as Ragan-Kelley explains, MIT recognized the importance of students not only understanding parallelism, but also optimizing memory and using specialized hardware to achieve the best possible performance.

“By changing the way we program, we can unlock the computing potential of new machines and enable people to continue to rapidly develop new applications and new ideas capable of taking advantage of this increasingly complicated and sophisticated hardware.”
