Here's the name mangling scheme:

Internal-only (including BASIClib) functions, macros, and variables are
always prefixed with "__". This is to prevent namespace clashes with
user-defined BASIC identifiers. As usual (and not unlike Java naming
conventions), the first word of an identifier should be lowercase, and the
first letter of each additional word in said identifier should be uppercase.

So, for example, our internal BASIClib API function that blocks until a
specified thread terminates is called "__waitForThreadToDie()".

For BASIC API functions (functionality exposed to user-defined BASIC code),
the function prototypes go like this (for example):

    In BASIC:  FUNCTION String$(x as Integer, str as String) As String
    In C:      PBasicString _vbSiS_string_DC_(int x, PBasicString str);

To explain:

All APIs start with "_vb".

The next character is the return value. If it's a SUB (void), the character
is 'p' for "procedure". Otherwise, it'll be the type of data returned. See
below.

The next characters are the arguments: one character per argument. See below.

The end of the encoded information in the above format is signified by an
underscore character ('_'). After that, the proper name of the function
(like "instr" or "bsave" or whatever) is appended, completing the "mangled"
API's name.

So, in short, it's a modified Hungarian notation. A little different, but
VERY detailed.

'$' chars in names, like "RTRIM$()", are encoded as "_DC_" (for "dollar
character"), as you can see above. Other potential ones are "_NC_"
('#'), "_EC_" ('!'), "_PC_" ('%')...
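
To make the recipe concrete, here is a minimal sketch in C of how a mangled
name could be put together. mangleName() and suffixEscape() are hypothetical
helpers made up for this illustration (they are not BASIClib functions), and
lowercasing the proper name is an assumption drawn from the "string_DC_"
example above.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: escape sequence for a BASIC type-suffix character,
   or NULL if the character needs no escaping. */
static const char *suffixEscape(char ch)
{
    switch (ch) {
        case '$': return "_DC_";
        case '#': return "_NC_";
        case '!': return "_EC_";
        case '%': return "_PC_";
        default:  return NULL;
    }
}

/* Hypothetical helper: writes "_vb" + return code + argument codes + "_" +
   the proper name (lowercased here by assumption, suffix chars escaped).
   Returns 0 on success, -1 if the output buffer is too small. */
int mangleName(char *out, size_t outSize, char retType,
               const char *argTypes, const char *basicName)
{
    const char *p;
    size_t pos;
    int n = snprintf(out, outSize, "_vb%c%s_", retType, argTypes);

    if (n < 0 || (size_t) n >= outSize)
        return -1;
    pos = (size_t) n;

    for (p = basicName; *p != '\0'; p++) {
        const char *esc = suffixEscape(*p);
        if (esc != NULL) {
            size_t len = strlen(esc);
            if (pos + len >= outSize)
                return -1;
            memcpy(out + pos, esc, len);
            pos += len;
        } else {
            if (pos + 1 >= outSize)
                return -1;
            out[pos++] = (char) tolower((unsigned char) *p);
        }
    }
    out[pos] = '\0';
    return 0;
}
```

Under these assumptions, mangleName(buf, sizeof buf, 'S', "iS", "String$")
fills buf with "_vbSiS_string_DC_", matching the prototype shown earlier.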
The characters that signify parameters and return value can be:
(Note that case is significant...)

    'b' == "boolean BYVAL"
    'i' == "integer BYVAL"
    'l' == "long BYVAL"
    'f' == "single (float) BYVAL"
    'd' == "double BYVAL"
    's' == "string BYVAL"         (*see note)
    'v' == "Variant BYVAL"        (*see note)
    'r' == "Array BYVAL"          (*see note)
    'a' == "Any variable BYVAL"   (**see note)

    'B' == "boolean BYREF"
    'I' == "integer BYREF"
    'L' == "long BYREF"
    'F' == "float BYREF"
    'D' == "double BYREF"
    'S' == "String BYREF"         (*see note)
    'V' == "Variant BYREF"        (*see note)
    'R' == "Array BYREF"          (*see note)

    'N' == "Null argument"        (****see note)

    '1' == "File mode"            (*****see note)
    '2' == "Access mode"          (*****see note)
    '3' == "Lock mode"            (*****see note)

    'n' == "n arguments"          (******see note)
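
One way a parser/compiler might hold this table is as a plain lookup array.
The sketch below is only an illustration: TypeCode and lookupTypeCode() are
made-up names, not part of BASIClib. Note that it stores the BYREF flag
explicitly rather than testing for uppercase, since 'N' is uppercase but is
not the BYREF form of 'n'.

```c
#include <stddef.h>

/* Hypothetical record for one type character of the mangling scheme. */
typedef struct {
    char code;            /* character used in the mangled name */
    const char *meaning;  /* what the character stands for */
    int isByRef;          /* 1 if the value is passed BYREF, else 0 */
} TypeCode;

static const TypeCode typeCodes[] = {
    { 'p', "procedure (SUB, no return value)", 0 },
    { 'b', "boolean", 0 },        { 'B', "boolean", 1 },
    { 'i', "integer", 0 },        { 'I', "integer", 1 },
    { 'l', "long", 0 },           { 'L', "long", 1 },
    { 'f', "single (float)", 0 }, { 'F', "single (float)", 1 },
    { 'd', "double", 0 },         { 'D', "double", 1 },
    { 's', "string", 0 },         { 'S', "string", 1 },
    { 'v', "variant", 0 },        { 'V', "variant", 1 },
    { 'r', "array", 0 },          { 'R', "array", 1 },
    { 'a', "any variable", 0 },
    { 'N', "null argument", 0 },
    { '1', "file mode", 0 },
    { '2', "access mode", 0 },
    { '3', "lock mode", 0 },
    { 'n', "n arguments", 0 }
};

/* Returns the table entry for a type character, or NULL if the character
   is not part of the scheme. Case is significant. */
const TypeCode *lookupTypeCode(char code)
{
    size_t i;
    for (i = 0; i < sizeof typeCodes / sizeof typeCodes[0]; i++) {
        if (typeCodes[i].code == code)
            return &typeCodes[i];
    }
    return NULL;
}
```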

*Strings, Arrays, and Variants are actually structures, so passing them by
reference (BYREF) passes a pointer to the structure. Passing them by value
(BYVAL) requires the parser/compiler to create an exact copy of the data, and
pass a pointer to the clone of that data. Before returning, the called
function must clean up BYVAL structures like this. Intrinsics are just popped
off the stack, and the called function does nothing directly with them. In
fact, the way C works, the called function doesn't pop the stack or arguments
at all.

You can see why we used 'S' and not 's' in the above example. Intrinsic data
types passed BYVAL and BYREF work just like their equivalents in C, and are
typically passed by value in the average API call, even though arguments to
BASIC-coded functions are BYREF by default.
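
The ownership rule in this note can be sketched roughly as follows. Everything
here is hypothetical: DemoString is a stand-in for whatever BASIClib's real
string structure looks like, and _vbis_demolen() / _vbiS_demolen() are made-up
API names that exist only to show the 's' versus 'S' difference.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BASIClib's string structure. */
typedef struct {
    char *data;
    size_t length;
} DemoString;
typedef DemoString *PDemoString;

PDemoString demoStringNew(const char *text)
{
    PDemoString s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->length = strlen(text);
    s->data = malloc(s->length + 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, text, s->length + 1);
    return s;
}

/* What the parser/compiler would arrange for a BYVAL argument:
   an exact copy of the caller's data. */
PDemoString demoStringClone(const DemoString *src)
{
    return demoStringNew(src->data);
}

void demoStringFree(PDemoString s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}

/* 's' (string BYVAL): the callee receives a clone and must clean it up
   before returning. */
int _vbis_demolen(PDemoString str)
{
    int result = (int) str->length;
    demoStringFree(str);
    return result;
}

/* 'S' (string BYREF): the callee receives the caller's own structure
   and must NOT free it. */
int _vbiS_demolen(PDemoString str)
{
    return (int) str->length;
}
```

A caller would write _vbis_demolen(demoStringClone(s)) for the BYVAL form and
_vbiS_demolen(s) for the BYREF form; in both cases s itself stays valid.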

**"Any" variable type is used for things like "pos(x)", where x is
simply any filler. That's stupid, but that's BASIC. Whatever you pass
for these functions is cast to a (void *), regardless of what sort of
data it really is, and sent along.

****API calls like "locate" can actually SKIP arguments, like
[locate x, y, , 3]. Yikes. The Null argument stands in for the
blank argument.

*****These are for the OPEN calls. It's ugly. I'm sorry.

******Functions that accept variable arguments, like CLOSE and ERASE, will be
broken down into multiple calls by the parser/compiler:

    CLOSE 1, 5, 7

becomes:

    _vbpin_close(1);
    _vbpin_close(5);
    _vbpin_close(7);

The identifier to signify that a given argument is repeated a variable number
of times is 'n'. Mathematically, 'n' is frequently used to specify iterations.

'n' must follow another parameter type character; the data type it follows is
the one that the parser/compiler will expect every argument in the variable
list to be.

'n' must be the last character in the information encoding. That is, after
this character, the next one MUST be '_' followed by the function's actual
name.
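
The two rules for 'n' are easy to check mechanically. Here is a small sketch
of such a check; checkVariadicMarker() is a hypothetical name, and the
function only validates the placement of 'n', not whether the other type
characters are legal.

```c
#include <string.h>

/* Returns 1 if the mangled name has the "_vb" prefix, a return-type
   character, a '_' terminator followed by a function name, and any 'n'
   obeys both rules: it follows a parameter type character and it is the
   last character before the terminating '_'. Returns 0 otherwise. */
int checkVariadicMarker(const char *mangled)
{
    const char *args;
    const char *end;
    const char *p;

    if (strncmp(mangled, "_vb", 3) != 0)
        return 0;
    if (mangled[3] == '\0' || mangled[3] == '_')
        return 0;                 /* missing return-type character */

    args = mangled + 4;           /* first argument character, if any */
    end = strchr(args, '_');      /* end of the encoded information */
    if (end == NULL || end[1] == '\0')
        return 0;                 /* no terminator, or no function name */

    for (p = args; p < end; p++) {
        if (*p != 'n')
            continue;
        if (p == args)
            return 0;             /* 'n' has no parameter type to repeat */
        if (p + 1 != end)
            return 0;             /* 'n' is not the last character */
    }
    return 1;
}
```

With this check, "_vbpin_close" and "_vbp_close" pass, while "_vbpn_close"
(nothing to repeat) and "_vbpni_close" ('n' not last) are rejected.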

So, CLOSE has two forms:

    void _vbp_close(void);                 /* close all files; takes no argument */
    void _vbpin_close(__integer fileNum);  /* close specific files */

The parser/compiler will figure out which function call is appropriate. If,
in this case, there are arguments to a CLOSE statement, and each is castable
to type (__integer), then the parser/compiler will generate as many calls to
_vbpin_close() as there are arguments to the BASIC equivalent. This is easier
to implement cleanly than using C's variable argument facilities, probably
faster in the long run, and less bug prone in both BASIClib and end-user
compiled code.

APIs using the 'n' char are expected to have at least one argument. To have
no arguments be valid, "overload" the API with a non-variable argument
version. For example, if _vbp_close() didn't exist, "CLOSE" would not be
considered a valid BASIC command unless it had at least one file number
specified with it. This allows for both the ERASE command, which needs
at least one argument, and the CLOSE command, which does something
different for no arguments... etc... (That, and by this method, you'd have
to overload your function for zero argument versions anyway, since the
statement gets broken down into multiple calls... what's the parser/compiler
going to do? Break it down into ZERO calls? That's obviously unwanted
behavior.)
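
As a rough sketch of the code generation described above (not the actual
parser/compiler), the following emits one call per argument and falls back to
the zero-argument overload when there are no arguments. emitCalls() is a
hypothetical name, integer arguments are assumed for simplicity, and the
ERASE name used in the note below is a guess made up for illustration.

```c
#include <stdio.h>

/* Writes the generated C calls for one BASIC statement into out.
   perArgName: mangled name of the 'n' variant (called once per argument).
   noArgName:  mangled name of the zero-argument overload, or NULL if the
               statement is not valid without arguments.
   Returns the number of calls emitted, or -1 on error. */
int emitCalls(char *out, size_t outSize, const char *perArgName,
              const char *noArgName, const int *args, int argCount)
{
    size_t used = 0;
    int i;
    int n;

    if (outSize == 0)
        return -1;
    out[0] = '\0';

    if (argCount == 0) {
        if (noArgName == NULL)
            return -1;            /* no overload: statement is invalid */
        n = snprintf(out, outSize, "%s();\n", noArgName);
        return (n < 0 || (size_t) n >= outSize) ? -1 : 1;
    }

    for (i = 0; i < argCount; i++) {
        n = snprintf(out + used, outSize - used, "%s(%d);\n",
                     perArgName, args[i]);
        if (n < 0 || (size_t) n >= outSize - used)
            return -1;            /* output buffer too small */
        used += (size_t) n;
    }
    return argCount;
}
```

For "CLOSE 1, 5, 7" this produces the three _vbpin_close() calls shown
earlier, and a bare "CLOSE" produces "_vbp_close();". Passing NULL for
noArgName (say, for a made-up "_vbpRn_erase") makes the zero-argument case
an error instead of ZERO calls.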

Other stuff:

This system allows us to have "overloaded" functions, and gives the
parser/compiler a more sane way than a giant conditional block to
determine if a function exists, with a few exceptions, such as one of
MID$()'s variants.

Please refer to special_cases.txt for possible exceptions to these rules...

/* end of name_mangling.txt ... */