feat(compiler): FFI and C# bindings for RVM, incl. host-await built-ins#672
Open
kusha wants to merge 10 commits into
Open
feat(compiler): FFI and C# bindings for RVM, incl. host-await built-ins#672kusha wants to merge 10 commits into
kusha wants to merge 10 commits into
Conversation
fa62435 to
54fbb82
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extends the RVM host-await feature end-to-end by wiring “registered host-await builtins” through the Rust compiler, VM accessors, the FFI layer, and the C# bindings, with accompanying docs and tests.
Changes:
- Add compiler support to register host-awaitable builtin names at compile time and emit
HostAwaitfor natural function-call syntax. - Expose host-await suspension inspection (identifier/argument) and run-to-completion response preloading via FFI + C# APIs.
- Add Rust YAML regression cases plus new C# binding tests and documentation updates.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/rvm/rego/mod.rs | Extends YAML harness to pass registered host-await builtins into compilation and (optionally) assert host-await arguments in suspendable mode. |
| tests/rvm/rego/cases/registered_host_await.yaml | Adds regression coverage for registered builtins (shadowing, queueing, arg-count validation, overrides). |
| src/rvm/vm/machine.rs | Adds VM accessors for host-await identifier/argument when suspended. |
| src/languages/rego/compiler/rules.rs | Adds compile_from_policy_with_host_await and wires registrations into compilation. |
| src/languages/rego/compiler/mod.rs | Adds storage + registration API for host-awaitable builtins (with validation). |
| src/languages/rego/compiler/function_calls.rs | Updates call resolution/emission to support registered host-await builtins and emit identifier literals. |
| docs/rvm/vm-runtime.md | Documents new VM host-await accessors. |
| docs/rvm/instruction-set.md | Documents registered host-await builtin behavior and resolution order. |
| docs/rvm/architecture.md | Describes explicit vs registered HostAwait emission paths. |
| bindings/ffi/src/rvm.rs | Adds FFI structs/APIs for compile-with-builtins, response preloading, and suspension JSON accessors. |
| bindings/csharp/Regorus/Rvm.cs | Adds C# wrappers for host-await inspection + response preloading. |
| bindings/csharp/Regorus/Program.cs | Adds CompileFromModules overload supporting host-await builtin registrations; refactors compile paths. |
| bindings/csharp/Regorus/NativeMethods.cs | Adds P/Invoke declarations and FFI struct for host-await builtins + new VM functions. |
| bindings/csharp/Regorus/ModuleMarshalling.cs | Generalizes UTF-8 string pinning helper and adds host-await builtin marshalling. |
| bindings/csharp/Regorus/Compiler.cs | Introduces public HostAwaitBuiltin struct for registration. |
| bindings/csharp/Regorus.Tests/RvmProgramTests.cs | Adds suspendable + run-to-completion tests for registered host-await builtins. |
| bindings/csharp/README.md | Adds usage examples for registered host-await in both execution modes. |
| bindings/csharp/API.md | Documents new/expanded Program/Rvm/HostAwaitBuiltin APIs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Allow hosts to register function names at compile time so that calls to those names emit HostAwait instructions directly, enabling natural syntax like fetch(x) instead of __builtin_host_await(x, "fetch"). - Add host_await_builtins map and register_host_await_builtin() to Compiler - Validate arg_count == 1 and reject reserved __builtin_host_await name - Extend determine_call_target() resolution: explicit > registered > user > builtin - Both explicit and registered paths emit identical HostAwait bytecode - Add compile_from_policy_with_host_await() entry point in rules.rs - Extended test harness with HostAwaitBuiltinSpec and args assertion - 9 YAML test cases: suspend/resume, run-to-completion, multiple names, queue, shadowing, object packing, arg_count rejection, reserved name rejection, standard builtin override - Documentation: instruction-set.md, architecture.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Mark Birger <birgerm@yandex.ru>
… registration - Compiler::register_host_await_builtin now rejects duplicate, empty, and whitespace-only names. Previously a duplicate registration would silently overwrite the existing entry, which could mask the host's own registration mistakes. - YAML test cases added: empty registration list as no-op, duplicate name rejection, empty/whitespace name rejection, out-param (a, out) calling syntax with a single-arg registered builtin, and mixed __builtin_host_await + registered builtins in the same policy consuming from their respective identifier queues. - Test harness: replace assert_eq! on HostAwait argument mismatch with anyhow::Error so mismatches propagate through the case reporter instead of panicking and skipping the harness's normal error path. - YAML comment fix: "Registration panics" -> "Registration fails with an error" (registration returns Err, never panics). Addresses anakrish + Copilot inline review comments on PR microsoft#667. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…riants Addresses PR microsoft#667 review item microsoft#8: at the emit site in `compile_function_call`, the discrimination between explicit `__builtin_host_await(arg, id)` and a registered host-awaitable builtin was being recovered by string-comparing `original_fcn_path` against `"__builtin_host_await"`. The information was already known in `determine_call_target` and was being thrown away. Replace the single `CallTarget::HostAwait` variant with two: * `ExplicitHostAwait` (unit) — the two-argument call form. The identifier register comes from the user's second argument. * `RegisteredHostAwait { identifier: String }` — the one-argument call form for registered builtins. The identifier is the registered name and is captured in the variant at recognition time, so the emit site never re-derives it from the function path. This removes the magic-string comparison at the emit site (the source of truth is now `determine_call_target`) and makes both match sites in `compile_function_call` exhaustive over the two forms — adding a third host-await form in the future would force a compile error at every match site instead of silently falling through. Arities are now hardcoded in the `expected_args` extraction (`Some(2)` for explicit, `Some(1)` for registered) rather than carried in the variant; registered builtins are constrained to `arg_count == 1` at registration time, so there is no per-call variability to carry. Bytecode output is unchanged; the full RVM test suite (97 cases) and the registered_host_await suite (15 cases) pass without modification. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…calls only PR microsoft#667 review (Medium): the docs implied registered host-await names shadow user functions and builtins unconditionally, but determine_call_target matches only the bare original_fcn_path. A package-qualified call such as data.demo.resolve(x) is therefore not intercepted -- it resolves through the normal path like any other call. Rather than expand registration to qualified paths (which would let a registered name leak into every package exposing a same-named rule), document the unqualified-only behavior and pin it with tests. - register_host_await_builtin: doc now states only the unqualified call form is intercepted; qualified calls resolve normally. - determine_call_target: inline comment explaining the deliberate original_fcn_path-only match. - docs/rvm/instruction-set.md: describe qualified-call resolution, including that builtins have no qualified form. - tests: cross-package and same-package qualified calls resolve to the rule; bare-name shadowing of a standard builtin; Unknown-function outcome when no rule exists at the qualified path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR microsoft#667 review (Low): the suspendable test harness compared the host-await argument via process_value(argument), but argument is already a runtime Value. process_value is a YAML-fixture decoder -- it rewrites "#undefined" to Undefined, {set!: [...]} to a set, and errors on a runtime Value::Set. Re-running it on the runtime argument could coerce a legitimate payload into a fixture sentinel (passing for the wrong reason) or error outright on sets. Compare the runtime argument directly against the expected value, which is already decoded once at YAML load time. Add a regression case (registered_builtin_suspendable_set_argument) that passes a set payload: it fails under the old double-processing ("unexpected set in value read from json/yaml") and passes with the fix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…mode PR microsoft#667 review (Low): a run-to-completion host-await response could carry an `args:` payload expectation, but RTC execution pre-loads responses and never surfaces the call argument to the harness, so the expectation was parsed and silently dropped. A case with `args: "WRONG"` passed as long as the result matched -- asserting a payload that was never checked. Reject `args:` for run-to-completion fixtures at load time, directing the author to suspendable mode where arguments are validated. Also only build the run-to-completion response vector when the case actually runs in RTC mode, so a suspendable case using the shared host_await_responses field with `args:` is not wrongly rejected. Route the fixture-load error through the same want_error handling used for compilation errors, and add registered_builtin_run_to_completion_rejects_args which now fails loudly instead of passing silently. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…space PR microsoft#667 review (Low): register_host_await_builtin rejected all-whitespace names via name.trim().is_empty(), but accepted padded names like " lookup" or "lookup ". Those were inserted into host_await_builtins, but Rego function-call paths produce the trimmed identifier, so a padded registration could never match -- a silent dead registration. Reject any name that is not already trimmed (name != name.trim()) in addition to empty names, and update the error message accordingly. Add test cases for leading and trailing whitespace. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Expose registered host-await builtins, Program compilation, and RVM accessors through the FFI layer and C# bindings. FFI (bindings/ffi/src/rvm.rs): - compile_from_modules with host-await builtin registration - set/get host-await responses, argument, identifier - RegorusHostAwaitBuiltin C struct C# (bindings/csharp/Regorus/): - Program.CompileFromModules overloads with HostAwaitBuiltin[] - Rvm.SetHostAwaitResponses, GetHostAwaitArgument, GetHostAwaitIdentifier - HostAwaitBuiltin readonly struct, ExecutionMode enum - ModuleMarshalling: PinnedUtf8Strings, PinnedHostAwaitBuiltins Rust (src/rvm/vm/machine.rs): - get_host_await_argument() and get_host_await_identifier() accessors Docs: README examples, API.md reference, vm-runtime.md accessors Tests: suspendable + run-to-completion C# scenarios
- HostAwaitBuiltin: single-arg (name) ctor; arg_count fixed to 1 - SetHostAwaitResponses: multi-identifier dictionary API - always route CompileFromModules through host-await FFI - dead-check/IIFE cleanup, pub(crate), expanded tests + docs
54fbb82 to
145ee89
Compare
Collaborator
|
@ review this PR using all the review skills in the repo. Use a separate agent for each skill. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Exposes the registered host-await builtins,
Programcompilation, and RVM runtime accessors through the FFI layer and C# bindings. This is the companion to #667 (registered host-await builtins in the compiler/VM), making the feature usable from C# consumers.Motivation
The previous PR added registered host-await builtins to the Rust compiler and VM. However, the FFI boundary and C# bindings only exposed the raw
__builtin_host_awaitpath. This PR bridges the gap so C# consumers can:Program.CompileFromModules, which emitsHostAwaitinstructions directly.Rvm.SetHostAwaitResponsesto queue responses (for any number of identifiers) before execution.Rvm.GetHostAwaitIdentifier()andRvm.GetHostAwaitArgument()to determine which builtin suspended and with what argument.Changes
Rust VM (
src/rvm/vm/machine.rs)get_host_await_argument()andget_host_await_identifier()-const fnaccessors that return the argument/identifier when the VM is in a HostAwait-suspended state.FFI (
bindings/ffi/src/rvm.rs)RegorusHostAwaitBuiltin-#[repr(C)]struct carrying a builtin name. The argument count is fixed to 1 by the compiler, so it is not part of the FFI surface.regorus_program_compile_from_modules_with_host_await- compile from modules with registered host-await builtins; the non-host-await path is treated as the empty-builtins case.regorus_rvm_set_host_await_responses- pre-load responses for run-to-completion mode. Accepts an array ofRegorusHostAwaitResponseSet(one per identifier), so multiple identifiers are configured in a single call.regorus_rvm_get_host_await_argument/regorus_rvm_get_host_await_identifier- JSON accessors for suspension state.C# Bindings (
bindings/csharp/Regorus/)HostAwaitBuiltin- readonly struct wrapping a builtin name (single-argument constructor; the arg count is fixed to 1 at the compiler level).ExecutionMode- enum (RunToCompletion,Suspendable).Program.CompileFromModules- overload acceptingIReadOnlyList<HostAwaitBuiltin>.Rvm.SetHostAwaitResponses(IReadOnlyDictionary<string, IReadOnlyList<string>>)- queue JSON responses for any number of identifiers in one call. Replaces the entire response configuration.Rvm.GetHostAwaitArgument()/GetHostAwaitIdentifier()- read suspension state.ModuleMarshalling-PinnedUtf8Strings,PinnedHostAwaitBuiltins, andPinnedHostAwaitResponseSetshelpers for safe FFI marshalling.Documentation
bindings/csharp/README.md- suspendable and run-to-completion examples with registered builtins.bindings/csharp/API.md-Program,Rvm,HostAwaitBuiltin,ExecutionModeAPI reference.docs/rvm/vm-runtime.md- the new VM accessors.Tests (
bindings/csharp/Regorus.Tests/RvmProgramTests.cs)RegisteredHostAwait_Suspendable_SuspendAndResume- registersget_account, suspends, verifies identifier and argument, resumes with a response.RegisteredHostAwait_RunToCompletion_WithPreloadedResponses- registerstranslate, pre-loads a response, verifies end-to-end execution.RegisteredHostAwait_RunToCompletion_MultipleIdentifiersInSingleCall- preloads responses for two identifiers in oneSetHostAwaitResponsescall.RegisteredHostAwait_RunToCompletion_FailsWhenResponseQueueExhausted- missing response surfaces as an error.RegisteredHostAwait_GetAccessorsReturnNullWhenVmIsNotSuspended- accessors return null outside a suspension.RegisteredHostAwait_CompileRejectsEmptyOrWhitespaceName/CompileRejectsDuplicateRegistration/CompileRejectsReservedName- registration validation errors surface through the bindings.Notes
SetHostAwaitResponsesreplaces the full response set: it configures responses for all supplied identifiers in a single call and replaces any previously configured responses. This mirrors the underlyingRegoVM::set_host_await_responsesreplace-all semantics (pre-existing upstream behavior).CompileFromEnginedoes not yet support host-await builtins: OnlyCompileFromModulesacceptsHostAwaitBuiltin[]. This is a scoping choice to keep the PR smaller - there is no technical constraint preventing it. The engine-based path can be extended in a follow-up if needed.