SpacePony is an experimental fork of the Pony programming language. The goal of the fork is to improve the FFI capabilities and add more systems programming language features.
- Quick start
- Breaking changes from original Pony
- List of additions/changes
- Future directions
- The original Pony README, which includes instructions on how to compile
SpacePony currently has no official binary releases; however, it is easy to clone this repo and build the compiler.
- First, clone this repo into any directory.
On Linux and other Unix-like systems, SpacePony uses Clang as the compiler by default. CMake and Python 3 are also needed.
- In the base directory type:
  - `make libs build_flags="-j6"` (this builds associated libraries such as LLVM and will take a while; adjust the `-j6` option to the number of cores you want to use)
  - `make configure`
  - `make build`
On Windows, SpacePony uses the MSVC compiler by default. CMake and Python 3 are also needed. Everything can be installed using MS Visual Studio. Open a PowerShell terminal in order to run the make script; the simplest option is to open the "Developer PowerShell for VS 202x", where the compiler tools are already added to the path.
- In the base directory type:
  - `.\make.ps1 libs` (this builds associated libraries such as LLVM and will take a while)
  - `.\make.ps1 configure`
  - `.\make.ps1 build`
The compiled output, including the ponyc compiler, is located in the build/release directory.
Did I miss anything? The "Building from source" guide will tell you more.
While SpacePony is compatible with original Pony, it is inevitable that the two will diverge more and more. Several keywords have been added; names that are valid in Pony but collide with these keywords cannot be used in SpacePony. This is a list of breaking changes that you might need to account for when adapting Pony source code to SpacePony.
- The following keywords have been added. They are either in use or reserved for future use.
  - enum
  - comptime
  - asm
  - tuple
  - nhb
  - offsetof
  - sizeof
- In the class `Iter` in the package `itertools`, the method `enum` was renamed to `enumerate` because `enum` is a reserved keyword in SpacePony.
- Added a build flag `spacepony` in order to identify SpacePony.

      ifdef spacepony then
        env.out.print("This is spacepony")
      else
        env.out.print("This is not spacepony")
      end
- The Pointer type can be used everywhere, not only as FFI function arguments.
- A Pointer can be initialized with any desired address using from_usize(addr: USize).

      var my_ptr = Pointer[None].from_usize(0x12345678)
- It is possible to obtain the Pointer value as a USize with the usize method. This is useful for printing the pointer value, for example.

      env.out.print("Address is " + Format.int[USize](ptr.usize(), FormatHex))
- The convert method is made public in order to make it possible to cast one pointer type to another.

      var ptr1: Pointer[MyStruct1] = ...
      var ptr2: Pointer[MyStruct2] = ptr1.convert[MyStruct2]()
- Added the to_reftype method to convert the Pointer to the underlying reference type. This is equivalent to the apply method of the NullablePointer type.

      var ptr: Pointer[MyStruct] = ...
      var my_struct: MyStruct = MyStruct
      try
        my_struct = ptr.to_reftype()?
      else
        ...
      end
- Added the to_reftype_no_check method to convert the Pointer to the underlying reference type without any null pointer check.

      var ptr: Pointer[MyStruct] = ...
      var my_struct: MyStruct = ptr.to_reftype_no_check()
- Added the pointer_at_index method in order to obtain a pointer at a specific index. Note that the resulting byte offset depends on the element type of the Pointer.

      var ptr: Pointer[U32] = ...
      var ptr2 = ptr.pointer_at_index(3) // Gives a new pointer at a byte offset of 4 * 3 from ptr
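The scaling by element type described above matches ordinary C pointer arithmetic. The sketch below is plain C, not SpacePony, and the helper name `byte_offset_at_index` is hypothetical; it only illustrates that index 3 on a pointer to 32-bit elements lands 12 bytes past the base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte distance between base and base + index,
   mirroring an index-based pointer offset for 32-bit elements. */
size_t byte_offset_at_index(const uint32_t *base, size_t index)
{
    const uint32_t *p = base + index; /* scaled by sizeof(uint32_t) */
    return (size_t)((const char *)p - (const char *)base);
}
```

Calling `byte_offset_at_index(buf, 3)` on an array of at least four `uint32_t` elements yields `3 * sizeof(uint32_t)`.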
- Added the from_reftype constructor in order to create a pointer from a class or struct.

      var ptr = Pointer[MyStruct].from_reftype(my_struct)
- Added the from_any constructor in order to create a pointer from anything of the desired type. This is like a cast assign.

      var ptr = Pointer[U32].from_any[MyStruct](my_struct)
- A class or a struct can be implicitly converted to a Pointer. A Pointer[None] will accept any class or struct type.

      struct S
      ...
      let s: S = S
      var ps: Pointer[S] = s
      var ps2: Pointer[None] = s
- addressof can be used everywhere.
- It is possible to use addressof on an FFI function. The type is a bare lambda. This is useful for populating bare lambdas in structs to emulate C++ abstract classes when they need to be initialized in Pony.

      use @FFI_Func[None](x: I32, y: I32)
      ...
      var f = addressof @FFI_Func
      f(3, 7)
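On the C side, the "abstract class" pattern mentioned above is a struct of function pointers. The sketch below is plain C with hypothetical names (`shape_ops`, `rect_area`, `rect_perimeter`, `describe`); it shows the kind of struct whose fields could be filled with bare lambdas or with addresses of FFI functions.

```c
#include <assert.h>

/* Hypothetical interface: a table of function pointers, the usual C way
   of emulating a C++ abstract class. */
struct shape_ops {
    int (*area)(int w, int h);
    int (*perimeter)(int w, int h);
};

int rect_area(int w, int h)      { return w * h; }
int rect_perimeter(int w, int h) { return 2 * (w + h); }

/* A consumer only sees the table, never the concrete functions. */
int describe(const struct shape_ops *ops, int w, int h)
{
    return ops->area(w, h) + ops->perimeter(w, h);
}
```

A caller fills the table once, for example `struct shape_ops ops = { rect_area, rect_perimeter };`, and passes `&ops` to any code that expects the interface.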
- It is possible to use constants (literals) as type arguments in generics, both as type arguments for types and type arguments for functions.
- Much of the work for constants in generics was done in ponyta (https://github.com/lukecheeseman/ponyta).

      class TT[n: USize, name: String]
        fun get_sum(k: USize): USize => n * k
        fun get_name(): String => name
        fun get_sum_const[k: USize](): USize => n * k
      ...
      var tt: TT[3, "constant string"] = TT[3, "constant string"]
- Default value type arguments are supported, both in function type arguments and class/struct type arguments.

      fun get_sum_const[k: USize = 4](): USize => n * k
- Added CFixedSizedArray in order to model fixed-size arrays found in C. For example, the C struct

      struct S {
          size_t size;
          char buf[512];
      };

  can be modelled in Pony as

      struct S
        var size: USize = 0
        embed buf: CFixedSizedArray[U8, 512] = CFixedSizedArray[U8, 512](0)
  Make sure to use `embed` in order to expand the array in the struct. Using `var buf: CFixedSizedArray[U8, 512]` would be the same as `char (*buf)[512]`, and Pony would allocate the fixed-size array on the heap.

- It is possible to emulate a C flexible array member (https://en.wikipedia.org/wiki/Flexible_array_member) that is allocated in C. The Array class can be loaded using the CFixedSizedArray.

      struct S
        var size: USize = 0
        embed buf: CFixedSizedArray[U8, 1] = CFixedSizedArray[U8, 1](0)
      ...
      var s = @get_struct()
      ...
      var ar = Array.from_cpointer(s.buf.cpointer(), s.size)
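On the C side, a struct with a flexible array member could be produced as sketched below. This is plain C under stated assumptions: the text does not show the C function behind `@get_struct()`, so the producer here is given a different, hypothetical name (`make_struct`) and a made-up allocation strategy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: buf has no size of its own; its storage is
   allocated together with the struct. */
struct S {
    size_t size;
    char buf[];
};

/* Hypothetical producer: allocates a struct S holding a copy of data.
   Returns NULL on allocation failure; the caller frees the result. */
struct S *make_struct(const char *data, size_t len)
{
    struct S *s = malloc(sizeof(struct S) + len);
    if (s == NULL)
        return NULL;
    s->size = len;
    memcpy(s->buf, data, len);
    return s;
}
```

The `size` field is what lets the receiving side know how many elements of `buf` are valid, which is why the Pony snippet above passes `s.size` along with the pointer.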
- Added offsetof in order to get the offset of a member in a struct or class.

      struct S
        var x: USize
        var y: USize
      ...
      var s = S
      let off_y = offsetof s.y // Gets the offset of y
      let off_y_2 = offsetof S.y // It is possible to use the type directly as base
- Added the sizeof operator to obtain the size of any type. It also works on nested types.

      struct S
        var x: USize
        var y: USize
      ...
      var s = S
      let sz_u32 = sizeof U32 // Any base type can be used
      let sz_s = sizeof s // Gets the size of the struct s
      let sz_s_2 = sizeof S // Type can be used directly as base in a dot expression
      let sz_s_y = sizeof s.y // Size of members can be used
      let sz_s_y_2 = sizeof S.y // Also with the type directly as base
- Keep in mind that neither sizeof nor offsetof is a compile-time constant in the compiler front end: they do not behave like literals and unfortunately cannot be used as type values in generics. They are created during the code generation step and only become compile-time constants in the LLVM code, not before that. The reason is that the SpacePony compiler uses LLVM to build target-dependent aggregate types in the code generation pass, which is one of the last passes. It would be possible to make sizeof and offsetof into literals, but that would require using LLVM to build up the types in earlier passes.
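For contrast, in C both operators are integer constant expressions, which is what allows them in places such as array bounds and static assertions. The sketch below is plain C11, not SpacePony; the struct mirrors the `S` used above, and `scratch`, `offset_of_y` and `scratch_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    size_t x;
    size_t y;
};

/* Usable at compile time in C: as an array bound and in a static assertion. */
unsigned char scratch[sizeof(struct S)];
_Static_assert(offsetof(struct S, x) == 0, "x is the first member");

size_t offset_of_y(void)  { return offsetof(struct S, y); }
size_t scratch_size(void) { return sizeof scratch; }
```

This is the behaviour that would be needed for `sizeof` and `offsetof` to serve as value arguments in generics.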
- Structs and classes can be passed by value to FFI functions. Add the annotation `\byval\` before the type in the parameter declaration. An annotation was used because it could be added easily without intruding too much on the existing Pony syntax. It can also coexist with other annotations, regardless of order.
- Note that this needs to be manually added for each CPU and OS target, as there is currently no functionality in LLVM that lowers the parameters. For more information read this article. There are currently discussions about adding an ABI lowering library to LLVM: "[RFC] An ABI lowering library for LLVM". If that effort succeeds and the library matures, SpacePony will transition to it. Until then expect bugs, as the lowering is tricky and full of corner cases.
- Currently supported targets:
  - x86-64 Linux, Windows
  - AArch64 Linux, macOS
  - Arm32 EABI
- These are not supported or tested but should be easy to get working:
  - x86-32 Windows
  - x86-64 macOS, which should use the System V ABI for x86-64 just like Linux.
- Example of a by-value FFI declaration and call:

      use @FFI_Func[\byval\ S](x: \byval\ S, y: \byval\ S)
      ...
      var s1 = S
      var s2 = S
      var ret = @FFI_Func(s1, s2)
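The C function on the other side of such a declaration takes and returns the aggregate by value. A minimal sketch in plain C; the field layout of `S` and the body of `FFI_Func` are assumptions made for illustration, since the text only gives the function's name and signature.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration only. */
struct S {
    int32_t x;
    int32_t y;
};

/* Both parameters and the return value are passed by value, so the callee
   works on copies and the caller's structs are left untouched. */
struct S FFI_Func(struct S a, struct S b)
{
    a.x += b.x; /* modifies the local copy only */
    a.y += b.y;
    return a;
}
```

How such small aggregates travel (in registers, on the stack, or through a hidden pointer) is decided by the target ABI, which is the lowering work described above.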
- Bare lambdas also accept pass-by-value parameters.

      let my_lambda: @{(\byval\ S, \byval\ S): \byval\ S} = @{(x: \byval\ S, y: \byval\ S): \byval\ S =>
        var ret = S
        ret.x = x.x + y.x
        ret
      }
      ...
      var s1 = S
      var s2 = S
      var ret = my_lambda(s1, s2)
- Note that pass by value in lambdas is currently essentially a double copy (only inside SpacePony). First the argument is copied to the stack, and then it is copied to a heap-allocated structure. The reason is that there is no escape analysis: the passed aggregate can be sent or stored somewhere, so it cannot live on the stack. There is room for future optimizations here, similar to how Pony can allocate on the stack rather than the heap.
- The C `const` qualifier is currently not supported. It might affect lowering for some targets and must therefore be taken into account in the future. To map the `const` qualifier in SpacePony, one possibility is to have `let` be const in the C FFI. However, this doesn't cover everything, as `embed` might also be const. Adding an annotation `\cconst\` to the type can cover this.
- Added support for an LLVM-like syntax for inline assembler.
- The syntax is `asm <assembler_string>, <constraints>, [return_type](parameters...) end`. For more information regarding the LLVM constraints syntax, see Inline Assembler Expressions.

      var res = asm "mov eax, $1\n mov ebx, $2\n mul ebx\n mov $0, eax\n", "=r,r,r,~{eax},~{ebx},~{edx}", [U32](x1, x2) end
- Multiple return values can be achieved by making the return type a tuple.

      (quot, rem) = asm "mov x9, $2\n mov x10, $3\n udiv x11, x9, x10\n mov $0, x11\n msub x12, x11, x10, x9\n mov $1, x12\n", "=r,=r,r,r,~{x9},~{x10},~{x11},~{x12}", [(U32, U32)](y1, y2) end
- Added an atomic library in the builtin package. The atomic type was modelled after std::atomic in C++.

      var a: Atomic[U32] = Atomic[U32](10)
      var r = a + 1 // same as a.add_fetch(1)
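Assuming `add_fetch` follows the usual add-then-fetch meaning (as in the GCC `__atomic_add_fetch` builtin), it can be read against the C11 equivalent: `atomic_fetch_add` returns the value before the addition, so the add-then-fetch result is that value plus the operand. The sketch below is plain C11 with a hypothetical helper name, not the SpacePony implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically add n and return the NEW value.
   atomic_fetch_add itself returns the OLD value. */
uint32_t add_fetch_u32(_Atomic uint32_t *a, uint32_t n)
{
    return atomic_fetch_add(a, n) + n;
}
```

The distinction matters when the result is used, as in `var r = a + 1` above: a fetch-then-add would hand back the previous value instead.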
- Added an extra capability called nhb. The nhb capability makes it possible to mutably share a class among actors. It is intended for classes that implement some form of custom synchronization, such as atomics or mutexes.

      class NhbClass
        var x: U64

      actor TA
        var _nc: NhbClass nhb

        new create(nc: NhbClass nhb) =>
          _nc = nc

        be beh(nh: NhbClass nhb) =>
          _nc = nh

      actor Main
        let _env: Env
        var _ta: TA
        var _ta2: TA
        var _nc: NhbClass nhb

        new create(env: Env) =>
          _env = env
          _nc = NhbClass
          _ta = TA(_nc)
          _ta2 = TA(_nc) // Yes you can give _nc to several actors
          var nc2: NhbClass nhb = NhbClass
          _ta.beh(nc2)
          _ta2.beh(nc2) // Yes, you can send that nhb class all you want, over and over again. Have fun.
- Added CTFE (compile-time function evaluation) to the Pony compiler. Currently it is only supported through a `comptime expression end` statement. Compile-time evaluation is mandatory inside this expression, and a failure to evaluate the expression at compile time makes the compilation fail. This corresponds to `consteval` in C++.
- Example:

      var x1: ILong = comptime count_to(10) end
      ...
      fun count_to(r: ILong): ILong =>
        var tt: ILong = 0
        while tt < r do
          tt = tt + 1
        end
        tt

  This will be run at compile time and reduced to a single literal of type `ILong`.
- The CTFE implementation is currently in its early stages, and not all expressions and types are supported yet.
- The CTFE has taken some ideas and code from ponyta (https://github.com/lukecheeseman/ponyta), but most of the implementation is completely new. The `#postexpr` in ponyta will not be used at all, and the `#` character can be used for future purposes instead.
- It is possible to go really far with CTFE, and the goal is to support as many expressions and types as possible. CTFE might be added to more places than only `comptime expression end`, which means it would also be possible to attempt CTFE at key places in the code. It is also possible to use CTFE to build string mixins, that is, self-generating code. The D language has string mixins, which can be used as a tool for meta-programming.
- Note that the purpose of CTFE is not really optimization but rather a guarantee that an expression can be evaluated at compile time. LLVM already does constant folding and can do much of the same job as CTFE. One important reason for adding CTFE was to be able to have expressions in value type parameters in generics.
- With CTFE it is possible to generate value type literals from compile time expressions.

      class C1[n: I32]
      fun gen_C1[u: I32, v: I32]() C1[comptime u + v end]
  Having `comptime` inside type arguments might be too verbose and unaesthetic, so it is possible to use `=` in front of the expression instead.

      fun gen_C1[u: I32, v: I32]() C1[= u + v]
  Why have `=` in front of the expression rather than the expression directly? Unfortunately it is a parsing technicality: the `=` is needed to make the parser select the correct rule, which would otherwise be ambiguous.

  One big problem with expressions in type arguments is that there is no type check when they are used. Right now the compiler simply accepts a type comparison as soon as an expression is encountered. The problem is that the type checks are done in passes prior to the reach pass, where the CTFE is run. Literals can easily be checked for equality, but not an expression that has not yet been reduced to a literal. Comparing AST trees is too difficult. Take for example `C1[= a + b + c] is C1[= c + b + a]`: the expressions differ, yet they later yield the same result in the reach pass, so the types are potentially the same. This was unresolved in ponyta and is currently also unresolved in SpacePony. Hopefully there will be a solution to this in the future.
- It is possible to load and save files at compile time inside a `comptime` expression using `CompTime.load_file` and `CompTime.save_file`. What is loaded at compile time can be processed at compile time but also at runtime, because the compiler can create constant object(s) from what is loaded. However, when saving data at compile time, the data must be processed at compile time for obvious reasons.

      comptime
        CompTime.save_file("123456".array(), CompTimeOutputDirectory, "output.txt")
      end

      var loaded_array = comptime
        CompTime.load_file(CompTimeWorkingDirectory, "input.txt") // This loads the data and creates a constant array object
        // Further processing can be made at compile time in order to
        // make other constant objects based on the input data.
      end

      comptime
        var array = CompTime.load_file(CompTimeWorkingDirectory, "input.txt") // Load the data
        ... // do some processing
        CompTime.save_file(array, CompTimeOutputDirectory, "output.txt") // Save processed data
      end
- Added the possibility of using enum-like declarations inside a primitive, a class or a struct. The Pony language doesn't have enums, and the go-to method is to use methods inside a primitive. However, this adds a lot of boilerplate just to write an enum, which is very tedious for large numbers of enums, and there is no auto-increment of the value. Adding a completely new enum type classification in SpacePony would be a lot of work, so instead a syntax that lowers the enumerations to methods inside the primitive was chosen. Unlike C/C++ enums, the enums must be given a type; there is no automatic inference of the smallest possible type that fits the enumerations.

      primitive P
        enum I32
          enum1 = 44
          enum2
          enum3
        end
- After the lowering, this becomes equivalent to the following.

      primitive P
        fun \property\ enum1(): I32 => 44
        fun \property\ enum2(): I32 => 44.add(1) // 45
        fun \property\ enum3(): I32 => 44.add(2) // 46
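The auto-increment in the lowering follows the same rule C uses for its enumerators. For comparison, a plain C sketch with hypothetical names:

```c
#include <assert.h>

/* Enumerators without an explicit value take the previous value plus one. */
enum P {
    P_ENUM1 = 44,
    P_ENUM2,
    P_ENUM3
};

int p_enum3_value(void) { return P_ENUM3; }
```

One difference remains visible here: the C enumerators have no explicitly chosen underlying type, whereas the SpacePony form names it (`I32`) up front.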
- When there are many enums, this makes them easier to write. C interfaces sometimes have many enums, often as return status values, and the ability to copy and paste most of them makes porting easier.
- Any type can be used in enums; however, for auto-incrementation the type must support the add method. If it does not, every enum must be given a value.
- Several enums per line are supported using the `;` delimiter.

      primitive P
        enum I32
          enum1 = 1; enum2; enum3
        end
- Using lowering onto existing primitives to implement enums was the path of least resistance. Adding "real" enums would require much more work in order to fit into the existing type system. The Pony language also already supports enums using a union of types.

      primitive P1
      primitive P2
      primitive P3
      primitive P4
      type PUnion is (P1 | P2 | P3 | P4)
- Pony type unions are fine for a moderate number of enums, both in terms of typing and code generation. This is usually the go-to method when there is no interest in the underlying representation, meaning no conversion to an integer is needed. Under the hood a primitive is a pointer to a global aggregate, like a class but without any members. Since there are no members, it can be made an immutable global. For large numbers of enums there might be missed optimization opportunities, as random pointers cannot easily be converted to jump tables. In this particular case monotonically increasing enums might be better. Despite enums having been implemented using lowering in primitives, classes and structs, a "real" enum in SpacePony isn't off the table.
- Added a `property` annotation for methods in order to remove the requirement for parentheses in calls to methods that have no parameters and return a value. This is especially useful for enums, which are methods under the hood but get the look and feel of constant variables. It is similar to what is available by default in the D language, and C# has properties as well, described in the C# programming guide:
  - A property is a member that provides a flexible mechanism to read, write, or compute the value of a data field. Properties appear as public data members, but they're implemented as special methods called accessors.
- The SpacePony enums inside primitives are automatically given the `property` annotation. Any method with the right signature can have the `property` annotation.

      primitive P
        enum I32
          e1 // enums will have the 'property' annotation by default
          e2
          e3
        end

        fun \property\ m1(): I32 => 22 // A method can be a property member

      class C
        var x: I32

        new create(x': I32) =>
          x = x'

        fun \property\ get_x(): I32 => x // It is not limited to literals and can
                                         // return any run time calculation
      ...
      let x = P.e1 // dereferencing a property method looks like accessing a member variable
      let y = P.m1
      let c1 = C(33)
      let z = c1.get_x
- A property can also be partial. Just add `?` after the name.

      class C
        var x: I32

        new create(x': I32) =>
          x = x'

        fun \property\ partial_x(): I32 ? =>
          if x < 10 then error else x end
      ...
      var c = C(5)
      var c_x: I32 = 0
      try
        c_x = c.partial_x?
      else
        c_x = 11
      end
      // c_x will have the value 11
- Write properties are supported by adding "_w" at the end of the method name; the method must have exactly one parameter. It is not allowed to shadow an existing variable. Partial write properties are supported in the same way as read properties.

      class C
        var x: I32

        new create(x': I32) =>
          x = x'

        fun \property\ prop_x(): I32 => x // This is the read property
        fun \property\ ref prop_x_w(x': I32) => x = x' // This is the write property
      ...
      let c1 = C(33)
      c1.prop_x = 44
- In the Pony language, when a field or variable is assigned, the old value is the result of the expression.

      var x: I32 = 0
      let y = (x = 2) // y gets the value 0
- Returning the old value is optional with properties. It is possible not to define a return type, in which case the property just returns `None`. It is also possible to define a return type and then return the value before the operation, or any value for that matter. This is similar to how the postfix increment (operator++(int)) in C++ is usually implemented.
- The extra "_w" is a way to work around the fact that the Pony language doesn't have function overloading. If function overloading were supported, the obvious choice would be for the read and write property methods to have the same name. It is possible that function overloading will be implemented in the future, in which case the extra "_w" will be removed as it would no longer be needed.
- Added the possibility to check if a type is an entity type in an iftype expression.

      fun is_class[A: AnyNoCheck](): Bool =>
        iftype A <: class then
          true
        else
          false
        end
- It is possible to include entity types in a union in order to check for several entity types.

      fun is_class_or_struct[A: AnyNoCheck](): Bool =>
        iftype A <: (class | struct) then
          true
        else
          false
        end
- Right now there is a class called FixedSizedArray, which is a copy of CFixedSizedArray but implemented as a class with a type descriptor. However, since it is already "spoiled" with an extra class member, why not add a size field as well? In that case FixedSizedArray can be converted to a FixedCapacityArray instead. It is unclear which is more useful, or whether FixedCapacityArray or FixedSizedArray is necessary at all in addition to CFixedSizedArray.
- Since there are suddenly several different arrays, a Slice class might be needed. This would be equivalent to std::span in C++. However, it is also possible to reuse the Array class for this purpose, as it is possible to load the Array with outside raw pointers. This works because the Array class uses garbage-collected pointers, which can coexist with foreign raw pointers. The D language has chosen this approach, merging the slice and the dynamic array. Both options, a separate Slice class or reusing Array, have their pros and cons.
- Real asynchronous IO rather than a POSIX-like wrapper: an API that can be used for anything streaming, such as files, HTTP and TCP. The API should also use the best available asynchronous OS API primitives.
- Implement a good and comprehensive reflection interface.
- Replace `use` for FFI definitions with `extern`. Right now the keyword `use` is reused for defining external C calls, with additional conditions trailing the definition. These conditions can easily get out of hand and become unreadable. For example:

      use @pony_os_writev[USize](ev: AsioEventID, iov: Pointer[(Pointer[U8] tag, USize)] tag, iovcnt: I32) ? if not windows

  As the number of targets increases, this becomes `if not windows and not anothertarget and not yetanothertarget`. Instead, what gets defined should be selected by `ifdef`, similar to `#ifdef` in C:

      ifdef windows then
        extern(C) @_write[I32](fd: I32, buffer: Pointer[None], bytes_to_send: I32)
      end

  This allows for more complex conditional statements that are still readable. Notice how the example above uses `extern(C)` instead of `use`. The reason is that `use` in its current form cannot be extended, for example to support importing C++ functions. It might be possible to extend `use` with something like `use(C++)`, but how it can be extended remains an open question.
- Implement function overloading.
- Move to an MLIR backend.
- GPU offload. The capability/actor model of Pony is very well suited for running parts of the actors on GPUs.
- Improve the Pony type system when it comes to reification.