The document discusses learning Ruby by reading its source code. It provides an overview of Ruby's basic object structure, with every object having an RBasic struct containing flags and a klass pointer. It describes the various structs used to represent different Ruby object types like integers, floats, strings, arrays, and classes. It details how Ruby uses singleton classes to implement class and instance methods. The document also explores Ruby's approach to class inheritance and module inclusion by examining how method lookup and class hierarchies are represented internally. It encourages spelunking the C source code to better understand how Ruby works under the hood.
6. Every object has an RBasic
struct RBasic {
VALUE flags;
VALUE klass;
}
7. flags stores information like whether the object is
frozen, tainted, etc.
struct RBasic {
VALUE flags;
VALUE klass;
}
It’s mostly internal stuff that you don’t think about
very often.
8. klass is a pointer to the class of the object
struct RBasic {
VALUE flags;
VALUE klass;
}
(or singleton class, which we’ll talk about later)
9. ...but what’s a VALUE?
struct RBasic {
VALUE flags;
VALUE klass;
}
10. VALUE is basically used as a void pointer.
typedef uintptr_t VALUE;
It can point to any ruby value.
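A minimal sketch of what "used as a void pointer" could look like in practice, assuming only the typedef above. The helpers `to_value` and `from_value` are hypothetical names for illustration, not MRI functions.

```c
#include <stdint.h>

typedef uintptr_t VALUE;

/* Hypothetical helpers (not part of MRI): round-trip a pointer through VALUE. */
static VALUE to_value(void *ptr)
{
    return (VALUE)ptr;   /* the address is stored as a plain integer */
}

static void *from_value(VALUE v)
{
    return (void *)v;    /* and turned back into a pointer when needed */
}
```

The caller still has to know what the pointer points at before dereferencing it.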
12. This is a Float.
struct RFloat {
struct RBasic basic;
double float_value;
}
13. Every type of object, including Float, has an RBasic.
struct RFloat {
struct RBasic basic;
double float_value;
}
14. And then, after the RBasic, type-specific info.
struct RFloat {
struct RBasic basic;
double float_value;
}
15. Ruby has quite a few types.
Each of them has their own
type-specific data fields.
16. But given a ‘VALUE’, we don’t
know which type we have.
How does ruby know?
17. Every object has an RBasic
struct RBasic {
VALUE flags;
VALUE klass;
}
And the object type is stored inside flags.
18. Given an object of unknown
type...
VALUE a
struct αѕgєנqqωσ {
struct RBasic basic;
ιηт נѕƒкq; // ???
ƒנє σтн¢נє; // ???
}
We can extract the type from ‘basic’, which is
guaranteed to be the first struct member.
19. e.g. if the type is T_STRING,
struct RString {
struct RBasic basic;
union {
struct {
long len;
...
then we know it’s a `struct RString`.
20. Every* type corresponds to a
struct type, which ALWAYS
has an RBasic as the first
struct member.
* exceptions for immediate values
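A rough sketch of the "RBasic first" trick, under stated assumptions: the `DEMO_` tag values and mask are illustrative, and `demo_builtin_type` and `demo_float_or` are hypothetical helpers rather than MRI's own macros. It only applies to heap objects, not immediate values.

```c
#include <stdint.h>

typedef uintptr_t VALUE;

struct RBasic { VALUE flags; VALUE klass; };
struct RFloat { struct RBasic basic; double float_value; };

/* Illustrative tag values and mask; treat the exact numbers as assumptions. */
enum { DEMO_T_FLOAT = 0x04, DEMO_T_STRING = 0x05, DEMO_T_MASK = 0x1f };

/* Hypothetical helper: read the type tag of a heap object.
 * Any object struct can be viewed as a struct RBasic because
 * RBasic is always its first member. */
static int demo_builtin_type(VALUE v)
{
    return (int)(((struct RBasic *)v)->flags & DEMO_T_MASK);
}

/* Hypothetical helper: only cast to struct RFloat once the tag says so. */
static double demo_float_or(VALUE v, double fallback)
{
    if (demo_builtin_type(v) == DEMO_T_FLOAT)
        return ((struct RFloat *)v)->float_value;
    return fallback;
}
```

The mask matters because flags also carries other bits (frozen and so on) alongside the type tag.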
21. There are custom types for
primitives, mostly to make
them faster.
24. T_OBJECT (struct RObject)
is pretty interesting.
It’s what’s used for instances
of any classes you define, or
most of the standard library.
25. TL;DR: Instance Variables.
struct RObject {
struct RBasic basic;
long numiv;
VALUE *ivptr;
struct st_table *iv_index_tbl;
}
This makes sense; an instance of a class has its own
data, and nothing else.
26. It stores the number of instance variables
struct RObject {
struct RBasic basic;
long numiv;
VALUE *ivptr;
struct st_table *iv_index_tbl;
}
27. And a pointer to the array holding the instance
variable values
struct RObject {
struct RBasic basic;
long numiv;
VALUE *ivptr;
struct st_table *iv_index_tbl;
}
28. This is a shortcut to the table on the object’s class
that maps instance variable names to slots in ivptr.
struct RObject {
struct RBasic basic;
long numiv;
VALUE *ivptr;
struct st_table *iv_index_tbl;
}
You could get the same result by looking it up on
basic.klass (coming up right away)
29. This definition is actually slightly simplified. I
omitted another performance optimization for
readability.
struct RObject {
struct RBasic basic;
long numiv;
VALUE *ivptr;
struct st_table *iv_index_tbl;
}
Go read the full one after this talk if you’re so
inclined!
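A toy model of how the three fields might cooperate. It assumes that iv_index_tbl maps a variable name to a slot number in ivptr and is shared by all instances of a class. Everything prefixed `demo_` is hypothetical: the real table is an st_table keyed by ID, and RBasic is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t VALUE;

/* Stand-in for st_table: a tiny name -> slot index map (hypothetical). */
struct demo_index_tbl {
    const char *names[8];
    long count;
};

struct demo_object {
    long numiv;                          /* number of slots in ivptr */
    VALUE *ivptr;                        /* this object's ivar values */
    struct demo_index_tbl *iv_index_tbl; /* shared by instances of one class */
};

/* Return the slot for name, or -1 if the class has never seen it. */
static long demo_iv_index(const struct demo_index_tbl *tbl, const char *name)
{
    for (long i = 0; i < tbl->count; i++)
        if (strcmp(tbl->names[i], name) == 0)
            return i;
    return -1;
}

/* Look up an ivar; stores it in *out and returns 1 if set, else returns 0. */
static int demo_ivar_get(const struct demo_object *obj, const char *name, VALUE *out)
{
    long idx = demo_iv_index(obj->iv_index_tbl, name);
    if (idx < 0 || idx >= obj->numiv)
        return 0;
    *out = obj->ivptr[idx];
    return 1;
}
```

Sharing the name table means each object only pays for a flat array of values.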
31. Classes have instance variables (ivars),
class variables (cvars), methods, and a superclass.
struct RClass {
struct RBasic basic;
rb_classext_t *ptr;
struct st_table *m_tbl;
struct st_table *iv_index_tbl;
}
32. This is where the methods live.
struct RClass {
struct RBasic basic;
rb_classext_t *ptr;
struct st_table *m_tbl;
struct st_table *iv_index_tbl;
}
st_table is the hashtable implementation ruby
uses internally.
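33. iv_index_tbl maps instance variable names to ivptr
slots for the class’s instances. (Class variables share
iv_tbl with the class’s own ivars.)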
struct RClass {
struct RBasic basic;
rb_classext_t *ptr;
struct st_table *m_tbl;
struct st_table *iv_index_tbl;
}
35. The superclass, instance variables, and constants
defined inside the class.
struct rb_classext_struct {
VALUE super;
struct st_table *iv_tbl;
struct st_table *const_tbl;
}
36. It ends up looking kinda like:
struct RClass {
struct RBasic basic;
VALUE super;
struct st_table *iv_tbl;
struct st_table *const_tbl;
struct st_table *m_tbl;
struct st_table *iv_index_tbl;
}
...though this isn’t really valid because rb_classext_t
is referred to by a pointer.
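One way the method table and the superclass pointer could combine during method lookup, as a simplified sketch. The `demo_` types are hypothetical: in MRI, super is a VALUE and m_tbl is an st_table, and the real lookup has caching that is ignored here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a method entry: a name plus an opaque body id. */
struct demo_method {
    const char *name;
    int body;
};

/* Hypothetical stand-in for struct RClass with the ext fields flattened in. */
struct demo_class {
    struct demo_class *super;        /* NULL at the top of the hierarchy */
    const struct demo_method *m_tbl; /* methods defined directly on this class */
    size_t m_count;
};

/* Walk the superclass chain and return the first method with a matching
 * name, or NULL if no class in the chain defines it. */
static const struct demo_method *
demo_search_method(const struct demo_class *klass, const char *name)
{
    for (; klass != NULL; klass = klass->super)
        for (size_t i = 0; i < klass->m_count; i++)
            if (strcmp(klass->m_tbl[i].name, name) == 0)
                return &klass->m_tbl[i];
    return NULL;
}
```

Because the search starts at the object's own class, a subclass definition shadows the one it inherits.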
39. Same underlying type (struct RClass) as a class
#define RCLASS(obj) (R_CAST(RClass)(obj))
#define RMODULE(obj) RCLASS(obj)
...just has different handling in a few code paths.
44. enum ruby_special_consts {
RUBY_Qfalse = 0,
RUBY_Qtrue = 2,
RUBY_Qnil = 4,
RUBY_Qundef = 6,
RUBY_IMMEDIATE_MASK = 0x03,
RUBY_FIXNUM_FLAG = 0x01,
RUBY_SYMBOL_FLAG = 0x0e,
RUBY_SPECIAL_SHIFT = 8
};
A pointer is basically just a big integer, with a
number referring to a memory address.
45. enum ruby_special_consts {
RUBY_Qfalse = 0,
RUBY_Qtrue = 2,
RUBY_Qnil = 4,
RUBY_Qundef = 6,
RUBY_IMMEDIATE_MASK = 0x03,
RUBY_FIXNUM_FLAG = 0x01,
RUBY_SYMBOL_FLAG = 0x0e,
RUBY_SPECIAL_SHIFT = 8
};
Remember how a VALUE is mostly a pointer?
These tiny addresses are in the kernel space
in a process image, which means they’re
unaddressable.
So ruby uses them to refer to special values.
46. enum ruby_special_consts {
RUBY_Qfalse = 0,
RUBY_Qtrue = 2,
RUBY_Qnil = 4,
RUBY_Qundef = 6,
RUBY_IMMEDIATE_MASK = 0x03,
RUBY_FIXNUM_FLAG = 0x01,
RUBY_SYMBOL_FLAG = 0x0e,
RUBY_SPECIAL_SHIFT = 8
};
Any VALUE equal to 0 is false, 2 is true, 4 is
nil, and 6 is a special value only used internally.
47. enum ruby_special_consts {
RUBY_Qfalse = 0,
RUBY_Qtrue = 2,
RUBY_Qnil = 4,
RUBY_Qundef = 6,
RUBY_IMMEDIATE_MASK = 0x03,
RUBY_FIXNUM_FLAG = 0x01,
RUBY_SYMBOL_FLAG = 0x0e,
RUBY_SPECIAL_SHIFT = 8
};
Integers and Symbols work on the principle that
memory is never allocated without 4-byte
alignment.
48. enum ruby_special_consts {
RUBY_Qfalse = 0,
RUBY_Qtrue = 2,
RUBY_Qnil = 4,
RUBY_Qundef = 6,
RUBY_IMMEDIATE_MASK = 0x03,
RUBY_FIXNUM_FLAG = 0x01,
RUBY_SYMBOL_FLAG = 0x0e,
RUBY_SPECIAL_SHIFT = 8
};
Any odd VALUE > 0 is a Fixnum.
An even VALUE not divisible by 4 might be a
Symbol.
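The tag-bit checks described on these slides can be sketched in a few lines. The `DEMO_` constants copy the values from the enum above. The `demo_` helpers are hypothetical names, and the 2n+1 encoding for small integers is an assumption that follows from the low bit being the Fixnum flag.

```c
#include <stdint.h>

typedef uintptr_t VALUE;

enum demo_special_consts {
    DEMO_Qfalse = 0, DEMO_Qtrue = 2, DEMO_Qnil = 4, DEMO_Qundef = 6,
    DEMO_IMMEDIATE_MASK = 0x03, DEMO_FIXNUM_FLAG = 0x01
};

/* Low bit set: the VALUE is a small integer, not a pointer. */
static int demo_is_fixnum(VALUE v) { return (v & DEMO_FIXNUM_FLAG) != 0; }

/* Either of the low two bits set: cannot be a 4-byte-aligned pointer. */
static int demo_is_immediate(VALUE v) { return (v & DEMO_IMMEDIATE_MASK) != 0; }

/* Truthiness: everything except false (0) and nil (4). */
static int demo_rtest(VALUE v) { return v != DEMO_Qfalse && v != DEMO_Qnil; }

/* Values that must never be dereferenced as an object pointer. */
static int demo_is_special_const(VALUE v)
{
    return demo_is_immediate(v) || !demo_rtest(v);
}

/* Encode n as 2n + 1 so the low bit is always set. */
static VALUE demo_int_to_fix(intptr_t n)
{
    return ((VALUE)n << 1) | DEMO_FIXNUM_FLAG;
}

/* Decode: (2n + 1 - 1) / 2 is exact, so this avoids shifting a negative. */
static intptr_t demo_fix_to_int(VALUE v)
{
    return ((intptr_t)v - 1) / 2;
}
```

Note that false and nil have both low bits clear, so they need the separate truthiness check to be recognised as non-pointers.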
63. Class methods
class Foo
  def bar
    :baz
  end
end
Foo.new.bar
We know how this works now.
class Foo
  def self.bar
    :baz
  end
end
Foo.bar
But how does this work?