A neat trick that Factor provides is the ability to disassemble functions into the machine code that is generated by the compiler. In 2008, Slava Pestov created a disassembler, and has improved it a bit since then (switching to udis86 for its implementation).
Constant Folding
The compiler performs constant folding. Using the compiler.tree.debugger vocabulary, you can output the optimized form of a quotation:
( scratchpad ) [ 2 2 + ] optimized.
[ 4 ]
Using the disassembler, you can see the machine code this generates:
( scratchpad ) [ 2 2 + ] disassemble
011c1a5530: 4983c608 add r14, 0x8
011c1a5534: 49c70640000000 mov qword [r14], 0x40
011c1a553b: c3 ret
011c1a553c: 0000 add [rax], al
011c1a553e: 0000 add [rax], al
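To try this on other expressions, something like the following sketch should work in the listener. It assumes the optimized. and disassemble words live in the compiler.tree.debugger and tools.disassembler vocabularies; the quotation itself is just an illustrative choice. The 0x40 stored in the listing above is probably the tagged fixnum form of 4, assuming four tag bits on a 64-bit build.

```factor
! Load the vocabularies that provide optimized. and disassemble
! (the listener may already have them in its search path).
USING: compiler.tree.debugger tools.disassembler ;

! Print the optimized form of a slightly larger constant expression.
[ 3 4 * 2 + ] optimized.

! Print the machine code generated for the same quotation.
[ 3 4 * 2 + ] disassemble
```

Addresses and instruction encodings will differ between machines and Factor builds, so compare the shape of the output rather than the exact bytes.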
Local Variables
One of the questions that comes up sometimes is whether local variables affect performance. We can examine two words that add numbers together, one using locals and one just using the stack:
( scratchpad ) : foo ( x y -- z ) + ;
( scratchpad ) :: bar ( x y -- z ) x y + ;
The "optimized output" looks a little different:
( scratchpad ) \ foo optimized.
[ + ]
( scratchpad ) \ bar optimized.
[ "COMPLEX SHUFFLE" "COMPLEX SHUFFLE" R> + ]
However, the machine code that is generated is identical apart from the addresses:
( scratchpad ) \ foo disassemble
01115de7b0: 488d1d05000000 lea rbx, [rip+0x5]
01115de7b7: e9e49439ff jmp 0x110977ca0 (+)
01115de7bc: 0000 add [rax], al
01115de7be: 0000 add [rax], al
( scratchpad ) \ bar disassemble
01115ef620: 488d1d05000000 lea rbx, [rip+0x5]
01115ef627: e9748638ff jmp 0x110977ca0 (+)
01115ef62c: 0000 add [rax], al
01115ef62e: 0000 add [rax], al
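The same comparison can be repeated for a word that does a little more work. The sketch below defines two hypothetical words, sq-sum and sq-sum-locals, which are not part of the original example, and prints both the optimized form and the machine code for each. Whether the generated code matches as closely as it does for foo and bar is something to check on your own build.

```factor
USING: compiler.tree.debugger kernel locals math tools.disassembler ;
IN: scratchpad

! Hypothetical words: the same computation written with the stack
! and with local variables.
: sq-sum ( x y -- z ) [ sq ] bi@ + ;
:: sq-sum-locals ( x y -- z ) x sq y sq + ;

! Compare the optimized forms...
\ sq-sum optimized.
\ sq-sum-locals optimized.

! ...and then the generated machine code.
\ sq-sum disassemble
\ sq-sum-locals disassemble
```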
Dynamic Variables
Another frequently used feature is dynamic variables, implemented by the namespaces vocabulary. For example, the definition of the print word looks for the current value of the output-stream variable and then calls stream-print on it:
( scratchpad ) \ print see
USING: namespaces ;
IN: io
: print ( str -- ) output-stream get stream-print ; inline
The optimized output inlines the implementation of get:
( scratchpad ) [ "Hello, world" print ] optimized.
[ "Hello, world" \ output-stream 0 context-object assoc-stack stream-print ]
You can inspect the generated machine code and see references to the Factor words that are being called:
( scratchpad ) [ "Hello, world" print ] disassemble
011c0c6c40: 4c8d1df9ffffff lea r11, [rip-0x7]
011c0c6c47: 6820000000 push dword 0x20
011c0c6c4c: 4153 push r11
011c0c6c4e: 4883ec08 sub rsp, 0x8
011c0c6c52: 4983c618 add r14, 0x18
011c0c6c56: 48b8dbc5a31a01000000 mov rax, 0x11aa3c5db
011c0c6c60: 498946f0 mov [r14-0x10], rax
011c0c6c64: 498b4500 mov rax, [r13+0x0]
011c0c6c68: 488b4040 mov rax, [rax+0x40]
011c0c6c6c: 498906 mov [r14], rax
011c0c6c6f: 48b86c91810e01000000 mov rax, 0x10e81916c
011c0c6c79: 498946f8 mov [r14-0x8], rax
011c0c6c7d: e8de4e36ff call 0x11b42bb60 (assoc-stack)
011c0c6c82: 4883c418 add rsp, 0x18
011c0c6c86: 488d1d05000000 lea rbx, [rip+0x5]
011c0c6c8d: e94e5264ff jmp 0x11b70bee0 (stream-print)
011c0c6c92: 0000 add [rax], al
011c0c6c94: 0000 add [rax], al
011c0c6c96: 0000 add [rax], al
011c0c6c98: 0000 add [rax], al
011c0c6c9a: 0000 add [rax], al
011c0c6c9c: 0000 add [rax], al
011c0c6c9e: 0000 add [rax], al
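To see how a lookup of your own dynamic variable is compiled, a minimal sketch along these lines may help. The greeting symbol is hypothetical and exists only for this illustration; get and with-variable come from the namespaces vocabulary.

```factor
USING: compiler.tree.debugger namespaces tools.disassembler ;
IN: scratchpad

! Hypothetical dynamic variable, used only for this illustration.
SYMBOL: greeting

! Optimized form and machine code for a plain variable lookup.
[ greeting get ] optimized.
[ greeting get ] disassemble

! The same lookup inside a dynamic scope that binds the variable.
[ "hi" greeting [ greeting get ] with-variable ] optimized.
```

The exact optimized form depends on the Factor version, so it may not match the print example word for word.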
1 comment:
"The compiler performs constant folding, using the compiler.tree.debugger vocabulary, you can output the optimized form of a quotation:"
Actually it's compiler.tree.optimizer. The debugger just prints optimizer output; it is not used as part of normal execution.