Inside Ruby Debuggers: TracePoint, Instruction Sequence, and CRuby API
Hello, Ruby developers!

Debugging is a key part of software development, but most developers use debuggers without knowing how they actually work. The RubyMine team has spent years developing debugging tools for Ruby, and we want to share some of the insights we've gained along the way.

In this post, we'll explore the main technologies behind Ruby debuggers: TracePoint, Instruction Sequence, and Ruby's C-level debugging APIs.

We'll begin with TracePoint and see how it lets debuggers pause code at key events. Then we'll build a minimal debugger to see it in action. Next, we'll look at Instruction Sequences to understand what Ruby's bytecode looks like and how it works with TracePoint. Finally, we'll briefly cover Ruby's C-level APIs and the extra power they offer.

This blog post is the second in a series based on the Demystifying Debuggers talk by Dmitry Pogrebnoy, RubyMine Team Leader, presented at EuRuKo 2024 and RubyKaigi 2025. If you haven't read the first post yet, it's a good idea to start there. Prefer video? You can also watch the original talk here.

Ready? Let's start!

The core technologies behind any Ruby debugger

Before diving into the debugger internals, it's essential to understand the two core technologies that make Ruby debugging possible: TracePoint and Instruction Sequence. Regardless of which debugger you use, they all rely on these fundamental features built into Ruby itself. In the following sections, we'll explore how each of them works and why they're so important.

TracePoint: Hooking into Code Execution

Let's begin with TracePoint, a powerful instrumentation technology introduced in Ruby 2.0 back in 2013. It works by intercepting specific runtime events, such as method calls, line executions, or exception raises, and executing custom code when these events occur. TracePoint works in almost any Ruby context, and it works well with Thread and Fiber.
However, it currently has limited support for Ractor.

Let's take a look at an example and see how TracePoint works.

    def say_hello
      puts "Hello Ruby developers!"
    end

    TracePoint.new(:call) do |tp|
      puts "Calling method '#{tp.method_id}'"
    end.enable

    say_hello
    # => Calling method 'say_hello'
    # => Hello Ruby developers!

In this example, we have a simple say_hello method containing a puts statement, along with a TracePoint that watches events of the call type. Inside the TracePoint block, we print the name of the method being called using method_id. Looking at the output in the comments, we can see that our TracePoint is triggered when entering the say_hello method, and only after that do we see the actual message printed by the method itself.

This example demonstrates how TracePoint lets you intercept normal code execution at specific points where special events occur, allowing you to execute your own custom code. Whenever your debugger stops on a breakpoint, TracePoint is in charge. This technology is valuable for more than just debugging. It is also used in performance monitoring, logging, and other scenarios where gaining runtime insights or influencing program behavior is necessary.

Building the simplest Ruby debugger with TracePoint

With just TracePoint technology, you can build what might be the simplest possible Ruby debugger you'll ever see.

    def say_hello
      puts "Hello Ruby developers!"
    end

    TracePoint.new(:call) do |tp|
      puts "Call method '#{tp.method_id}'"
      while (input = gets.chomp) != "cont"
        puts eval(input)
      end
    end.enable

    say_hello

This is almost the same code as in the TracePoint example, but this time the body of the TracePoint block is slightly different.

Let's examine what's happening here. The TracePoint block accepts user input via gets.chomp, evaluates it in the current context using the eval method, and prints the result with puts. The loop repeats until you enter cont, at which point the program resumes.
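
One detail worth noting about the mini debugger: a bare eval inside the block runs in the block's own scope, so local variables of the method being entered are not visible to it. TracePoint#binding returns the binding of the frame that triggered the event, which gives access to those locals. The following is a minimal, non-interactive sketch of that variation; the greet method and the captured hash are hypothetical names used only for illustration.

```ruby
# Hypothetical method with a local (its parameter) that we want to inspect.
def greet(name)
  "Hello, #{name}!"
end

captured = {}

tp = TracePoint.new(:call) do |t|
  # t.binding is the binding of the method frame that raised the event,
  # so evaluating "name" here reads the argument passed to greet.
  captured[:name]   = t.binding.eval("name")
  captured[:locals] = t.binding.local_variables
end

# Trace only greet, so unrelated method calls do not trigger the block.
tp.enable(target: method(:greet))
greet("Ruby")
tp.disable

p captured
```

In the interactive loop, the same idea would mean calling tp.binding.eval(input) instead of eval(input).
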
That's really all there is to it: a straightforward and effective debugging mechanism in just a few lines of code.

This enables one of the core features of a debugger: the ability to introspect the current program context on each method invocation and modify the state if needed. You can, for example, define a new Ruby constant, create a class on the fly, or change the value of a variable during execution. Simple and powerful, right? Try running it yourself!

Clearly, this isn't a complete debugger: it lacks exception handling and many other essential features. But when we strip away everything else and look at the bare bones, this is the fundamental mechanism that all Ruby debuggers are built upon.

This simple example demonstrates how TracePoint serves as the foundation for Ruby debuggers. Without TracePoint technology, it would be impossible to build a modern Ruby debugger.

Instruction Sequence: Ruby's bytecode revealed

Another crucial technology for Ruby debuggers is Instruction Sequence.

Instruction Sequence, or iseq for short, represents the compiled bytecode that the Ruby Virtual Machine executes. Think of it as Ruby's assembly language, a low-level representation of your Ruby code after compilation into bytecode. Since it's closely tied to the Ruby VM internals, the same Ruby code can produce a different iseq in different Ruby versions, not just in terms of instructions but even in the overall structure and the relationships between different instruction sequences.

Instruction Sequence provides direct access to the low-level representation of Ruby code. Debuggers can leverage this feature by toggling certain internal flags or even modifying instructions in an iseq, effectively altering how the program behaves at runtime without changing the original source code.

For example, a debugger might enable trace events on a specific instruction that doesn't have one by default, causing the Ruby VM to emit an event, and the debugger to pause, when that point is reached.
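
The public API exposes one form of this selective enabling: TracePoint#enable accepts the target: and target_line: keywords, which switch on events only for one method and one line. Below is a minimal sketch, assuming Ruby 2.6 or later; greet, breakpoint_line, and hits are hypothetical names chosen for illustration.

```ruby
# Hypothetical method in which we want a "breakpoint" on the second body line.
def greet(name)
  greeting = "Hello, #{name}!"
  greeting.upcase
end

# source_location reports the line of `def`; `greeting.upcase` is two lines below.
breakpoint_line = method(:greet).source_location[1] + 2

hits = []
tp = TracePoint.new(:line) do |t|
  # A real debugger would suspend here; this sketch only records the hit.
  hits << [t.method_id, t.lineno]
end

# Line events are enabled only inside greet, and only for the chosen line.
tp.enable(target: method(:greet), target_line: breakpoint_line)
result = greet("Ruby")
tp.disable

p hits
```

Only the targeted line reports an event; the rest of the program runs untraced.
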
This is how breakpoints in specific language constructs and stepping through chains of calls work. The ability to instrument bytecode directly is essential for building debuggers that operate transparently, without requiring the developer to insert debugging statements or modify their code in any way.

Let's take a look at how to get an Instruction Sequence in Ruby code.

    def say_hello
      puts "Hello Ruby developers !"
    end

    method_object = method(:say_hello)
    iseq = RubyVM::InstructionSequence.of(method_object)
    puts iseq.disasm

Let's examine this code more closely. First, we have our familiar say_hello method containing a puts statement. Then, we create a method object from it using method(:say_hello). Finally, we get the Instruction Sequence for this method and print out its human-readable form using disasm. This lets us peek under the hood and see the actual bytecode instructions that Ruby will execute.

Let's examine the output and see what it looks like.

    == disasm: #<ISeq:say_hello@iseq_example.rb:1 (1,0)-(3,3)>
    0000 putself                                  (   2)[LiCa]
    0001 putchilledstring        "Hello Ruby developers !"
    0003 opt_send_without_block  <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
    0005 leave                                    (   3)[Re]

The first line shows metadata about our Ruby entity: specifically, the say_hello method defined in iseq_example.rb with a location range of (1,0)-(3,3). Below that are the actual instructions that the Ruby VM will execute. Each line represents a single instruction, presented in a human-readable format. You can easily spot the "Hello Ruby developers !" string argument preserved exactly as it appears in the source code, without any encoding or decoding complexity, even with non-ASCII characters. Such transparency makes it easier to understand what's happening at the bytecode level.

Instruction Sequence plays a critical role in Ruby debugging by marking key execution points in the bytecode. In the bracket notation in the output, you can see markers like Li for line events, Ca for method calls, and Re for returns.
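
The Li, Ca, and Re markers correspond to the :line, :call, and :return event types in TracePoint. As a minimal sketch of that correspondence, the snippet below subscribes to those three events for say_hello only and records the order in which they arrive; the events array is a hypothetical helper used for illustration.

```ruby
def say_hello
  puts "Hello Ruby developers!"
end

events = []

tp = TracePoint.new(:call, :line, :return) do |t|
  # Record which kind of event the VM reported.
  events << t.event
end

# Restrict tracing to say_hello so that no other code adds events.
tp.enable(target: method(:say_hello))
say_hello
tp.disable

p events
```

Here events ends up as [:call, :line, :return], matching the Ca and Li markers on the first instruction and the Re marker on leave.
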
These markers tell the Ruby VM when to emit runtime events. TracePoint relies on these markers to hook into the running program: it listens for these events and steps in when they happen. This tight connection between the two technologies is what makes it possible for debuggers to pause execution and inspect the state.

Going deeper: Ruby's C-level debugging API

So far, we've looked at the two core technologies behind Ruby debuggers: TracePoint and Instruction Sequence. These are enough to build a working Ruby debugger. However, if you want to implement advanced features like those offered by RubyMine, such as smart stepping or navigating back and forth through the call stack, TracePoint and Instruction Sequence alone won't cut it. To support such capabilities, you need to go a level deeper and tap into the low-level debugging APIs provided by Ruby itself.

CRuby exposes a number of internal methods that fill the gaps left by the public Ruby APIs. These methods are defined in C headers such as vm_core.h, vm_callinfo.h, iseq.h, and debug.h, among others. These internal interfaces can unlock powerful capabilities that go beyond what's possible with the public API, but they come with important trade-offs.

Since they are specific to CRuby, debuggers using them won't work with other implementations like JRuby or TruffleRuby. Another downside is that these APIs are not public or stable across Ruby versions. Even minor updates can break them, which means any debugger depending on these methods needs constant attention to keep up with Ruby's changes. Still, it's worth exploring a few of these internal methods to get a better idea of what this low-level API looks like and what it provides for debugger tools.

Let's start with rb_tracepoint_new(...):

    VALUE rb_tracepoint_new(VALUE target_thread_not_supported_yet,
                            rb_event_flag_t events,
                            void (*func)(VALUE, void *),
                            void *data);

This method works like creating a trace point in Ruby code, but with more flexibility for advanced use.
It's especially helpful for low-level debuggers written as C extensions that need deeper access to the Ruby VM. In the RubyMine debugger, this approach allows more precise control over when and where to enable or disable trace points, which is essential for implementing smart stepping.

Another useful method is rb_debug_inspector_open(...):

    VALUE rb_debug_inspector_open(rb_debug_inspector_func_t func, void *data);

This C-level API lets you inspect the call stack without changing the VM state. The func callback receives an rb_debug_inspector_t struct, which provides access to bindings, locations, instruction sequences, and other frame details. In the RubyMine debugger, it's used to retrieve the list of frames and to let you move back and forth between them on the call stack while the program is suspended by the debugger. Without this API, frame navigation and custom frame inspection in Ruby would be much more difficult.

The final example is a pair of methods for working with iseq objects. The method rb_iseqw_to_iseq(...) converts an iseq from a Ruby value to a C value, while rb_iseq_original_iseq(...) converts it back from C to Ruby. These let Ruby debuggers switch between Ruby and C-extension code when precise, low-level control is needed. In the RubyMine debugger, they are actively used in the implementation of smart stepping, helping determine which code should be stepped into during debugging.

These low-level APIs offer powerful tools for building advanced debugging features, the kind that aren't possible with TracePoint and Instruction Sequence alone. But they come with a cost: platform lock-in to CRuby and a high maintenance burden due to their instability across Ruby versions. Despite that, they remain essential for debuggers that need deep integration with the Ruby VM.

Conclusion

In this post, we explored the foundational technologies that power Ruby debuggers: TracePoint and Instruction Sequence.
These two components form the basis for how modern Ruby debuggers observe and interact with running Ruby code. TracePoint enables hooks into specific runtime events like method calls and line execution, while Instruction Sequence provides low-level access to the compiled Ruby VM bytecode.

We also took a brief look at how low-level CRuby C APIs exert even more precise control over code execution, offering insight into how debuggers like RubyMine implement advanced features. While we didn't dive into full debugger implementations here, this foundation lays the groundwork for understanding how these tools operate.

Stay tuned: in a future post, we'll go further into how modern debuggers are built on top of this foundation.

Happy coding, and may your bugs be few and easily fixable!

The RubyMine team