Tooling :: Interop story for .NET libraries using C# source generators for high...

source link: https://github.com/dotnet/fsharp/issues/14300

Contributor

T-Gro commented Nov 11, 2022

edited

I want to open the discussion on consuming .NET libraries built with C# source generator support (https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/source-generators-overview).

With every .NET version, more APIs use it, and it is part of the reason for the unprecedented performance boosts of the .NET platform across various computing tasks. A few examples include:

And I think it is only a matter of time before database libraries or structured loggers (incl. telemetry) have it as well.

I do not believe that F# needs its own clone of source generators; concepts like type providers or Myriad already allow accomplishing similar goals. Therefore I did not continue this as a comment on fsharp/fslang-suggestions#864, which I believe has different aspirations.

I do want to reopen the discussion from the .NET library consumption perspective, especially around libraries/frameworks that are massively backed and invested in (like aspnetcore), and are expensive to replicate and part of the big performance wins.
F# is a "succinct, robust and performant language for .NET", and the ability to consume the fastest of .NET's libraries IMO comes with this motto.

The latest resolution I could find on the topic is "use C# project" (fsharp/fslang-suggestions#864 (comment)), which makes sense from a short-term (< 3 years) perspective.

However, the broader the usage of C# code gen in well-established libraries, the more slices would have to be made in a project to separate the F# pieces (where F# programmers want to write) from the C# parts (needed simply because the libraries require them). It is important to note that this might span multiple layers of the application, so it isn't just "1 F# and 1 C# project", but rather an interleaved sandwich depending on the level of the stack a library is targeting. The cognitive complexity of reading a project like this (which has turned into a solution by now) is objectively bigger, and the typical Solution Explorer display does not make the dependency order visible at first glance.

Which brings me to the tooling topic - what can we do better to support a smooth workflow using such libraries in a project that would otherwise want to be F#-only?

There is an older suggestion about mixed projects fsharp/fslang-suggestions#117 , which was correctly resolved as being a tooling issue and not a change to F# language itself.

My current view is that the user-facing side of this feature could look like embedding a single standalone C# file into the middle of an F# project.
That C# file would have access to all project/package dependencies (this is where the source gen stuff is) and to the F# files before it, but NOT to the files after it; it would be accessible only to the F# files coming after it.
If this eliminates any worries, I think this would be handy even if always restricted to 1-C#-file scenarios only.

From the IDE side, I could imagine this being a "lightweight project within a project", with a .cs file sitting inside the .fsproj and the F# compiler knowing to split the project into multiple compilation units, invoke Roslyn underneath, and put the results together in the right order.

I will wait for someone more knowledgeable to assess whether merging the produced C# & F# IL together is even theoretically possible, or if this would have to be independent .dlls on the output.

This is an XXL item

daniellittledev and eriawan reacted with thumbs up emoji

This is a really cool idea.

I wonder if a first step in this could be to force the .cs files to be first, but that is largely due to a specific concern, so I will explain it.

Due to how Roslyn source gen works, you cannot separate source gen from compilation. Because of this, I believe we will have to hydrate a C# project, and that in VS that would have to be hydrated in the workspace and remain part of the design time build for the project. From the VS perspective, I think that the C# project would have to appear real. Happy to have those wiser on VS correct me.

Project and package dependencies are giving me a bit of a headache right now, so let's just assume that is solvable.

Since F# is order dependent in the project file, I do not have my head around a way to do this with partial dependencies - the ones that would be available part way through the F# compilation. That is why I suggested that at least for a first cut, we have the C# dependencies isolated. If that makes the feature worthless, let's know that up front.

The alternative, and maybe this is what you have in mind, is that the F# compiler would maintain a separate transient C# project for every non-sequential set of .cs files. I do not see how this could work, because the C# project would view the F# project via project dependencies and a) that would be a circular reference, which is not supported, and b) what F# project is available halfway through an F# compilation?

I look forward to feedback on this!

Adding @chsienki in hopes he can look at this question from a different perspective.

PS. In the examples listed, some seem relatively timeless, such that we could do work to provide the same code in a different way with a potentially different gesture in F# - JSON and RegEx. gRPC is probably pretty stable too, but there are many areas that are not. I agree that if we do nothing F# will be disadvantaged in these scenarios, and I am interested in how important getting the performance in the C# way is to people. Could we work with the community to find more F# answers for the scenarios that matter, learning from the work in C#? Happily, source generators make it easy to understand what C# is doing to gain performance (or as easy as possible in the case of RegEx ;-)

Member

vzarytovskii commented Nov 11, 2022

edited

One problem is that many C# code generators rely on partial assemblies/types, which means we either have to:

  • Support those as "augmentations" to types in F#.
  • Allow C# files to be part of an F# project, and use them with an artificial implicit project to run the source generator and produce an assembly which we then use.
    • Multiple problems with this: generators won't really have access to any F# types.
  • Probably a bunch of other options, like cross-compiling F# to C# or something wild like this.

That said, I think we should take the top libraries which use source gen, look at their use cases, and see what's needed from us to support them.
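As a point of reference for the "augmentations" option, the closest thing F# has today is an intrinsic type extension. The sketch below uses a made-up `FastRegex` type (the name and members are illustrative only) and shows the member being compiled into the type itself. Note the built-in restriction: an intrinsic extension must live in the same file and the same namespace or module as the type, which is exactly what a generator-produced half of a partial type would not satisfy.

```fsharp
// Hand-written half of the type (hypothetical names, for illustration only).
type FastRegex() =
    member _.Pattern = "WowF#"

// Intrinsic extension: same file and same namespace/module as FastRegex,
// so Describe is compiled into FastRegex itself, not as an extension method.
type FastRegex with
    member this.Describe() = $"regex for {this.Pattern}"

printfn "%s" (FastRegex().Describe())
```

Supporting C#-style partial types would presumably mean relaxing that same-file rule in some controlled way; how that would look is an open question in this thread.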

Contributor

Author

T-Gro commented Nov 11, 2022

edited

Indeed, the solution I had in mind was splitting the user-visible project and doing a separate compilation unit for each block, treating a change of language as a switch into a new unit.
So in this case, there would be 5 (!) compilation units, each referencing its predecessors as separate assemblies.
That would also mean that in context of an F# project, the .cs files would NOT see each other bidirectionally, and the visibility would follow the project order as it does with F# files.

After those 5 separate compilation units are done, it would be of course good to put them back together into a single .dll. If that is doable, I do not know. (e.g. if a .dll created this way and containing output from two different compilers could create issues somewhere down the road when consumed)

It might look crazy to do 5 different compilation units, but in the end this is what users do today when separating those into projects manually.
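The splitting rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the language is decided by file extension, and the file names are hypothetical (they are not taken from the screenshot). Each unit would reference all units before it.

```fsharp
open System.IO

/// Language of a source file, decided by extension (an assumption of this sketch).
type Lang =
    | FSharp
    | CSharp

let langOf (path: string) =
    match Path.GetExtension(path).ToLowerInvariant() with
    | ".cs" -> CSharp
    | _ -> FSharp

/// Splits an ordered file list into compilation units:
/// a new unit starts whenever the language changes.
let splitIntoUnits (files: string list) : (Lang * string list) list =
    ([], files)
    ||> List.fold (fun units file ->
        let lang = langOf file
        match units with
        | (l, fs) :: rest when l = lang -> (l, file :: fs) :: rest
        | _ -> (lang, [ file ]) :: units)
    |> List.map (fun (l, fs) -> l, List.rev fs)
    |> List.rev

/// Units that unit `index` (0-based) may reference: only its predecessors.
let referencesOf (units: (Lang * string list) list) (index: int) = List.take index units

// Hypothetical project order with interleaved languages.
let units =
    splitIntoUnits [ "Domain.fs"; "Json.cs"; "Api.fs"; "Regexes.cs"; "Program.fs" ]

units
|> List.iteri (fun i (lang, fs) -> printfn "unit %d (%A): %A" (i + 1) lang fs)
```

With the five alternating files above, the split yields five units, which matches what users end up doing by hand with separate projects.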

image

Contributor

En3Tho commented Nov 14, 2022

edited

One way might be to embed a piece of C# code in F#.
Many simple but useful things like LibraryImportGenerator or RegexGenerator use only a single partial method and an attribute to flag source generation. I guess they can be a goal for a start?

F# code ...
```csharp // like an md for example
public partial class FastRegex // partial methods need a partial containing type
{
    [RegexGenerator("WowF#")]
    public static partial Regex MyCoolRegex();
}
``` //

if FastRegex.MyCoolRegex().IsMatch(...) then

Pros:

  1. It sort of has a natural fit with F#, in the sense that code above won't know about MyCoolRegex and code below will (at least this is the idea).
  2. You don't need to make a dedicated file for this.

Cons:

  1. Looks out of place.
  2. All the ceremony with files is still there - need to think about how to extract this code bit to a dedicated file, pass it to Roslyn, import it back, place breakpoints etc., and also how to restrict accessibility
  3. Need strict rules about where such code can be placed (I guess inside namespace only or inside a module but namespace feels easier to do)
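For comparison with the embedded-C# idea above, here is roughly what F# code can do today without the generator: build the regex once and ask for `RegexOptions.Compiled`. The module and function names mirror the example above and are illustrative only. The difference is that `RegexOptions.Compiled` emits the matcher at run time, whereas the C# source generator emits it at build time.

```fsharp
open System.Text.RegularExpressions

module FastRegex =
    // Created once when the module is initialised.
    // RegexOptions.Compiled generates IL at run time, not at build time.
    let private myCoolRegex = Regex("WowF#", RegexOptions.Compiled)

    let MyCoolRegex () = myCoolRegex

if FastRegex.MyCoolRegex().IsMatch("WowF# indeed") then
    printfn "matched"
```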

Another option is trying to revive F# -> Roslyn interop.
But as @vzarytovskii stated, F# needs to have support for "partial" at least.

With the new "file" modifier I believe some of the complexity is gone, because generators do not need to scan assemblies for similar type names, resolve conflicts etc. This might be easier to do now.

Pros:

  1. Do not need .cs files at all (at this stage at least), feels very natural to F#

Cons:

  1. Need to create both export to and import from Roslyn: export F# AST => C# AST for SG, wait for SG, import C# AST => F# AST (virtually or in other ways)

type Regex with
    [<RegexGenerator("WowF#")>]
    static member MyCoolRegex() = partial // keyword?

if Regex.MyCoolRegex().IsMatch(...) ...

The main idea behind these proposals is to make generated code visible to the code right below it, so as not to introduce a "hard" split in the code and logic. I believe this might be one of the hardest things?

Member

vzarytovskii commented Nov 15, 2022

One of the options is trying to revive F# -> Roslyn interop.

This would be an extremely fragile solution and would require constant changes to keep up with Roslyn changes.

Contributor

dsyme commented Nov 15, 2022

edited

This is a big topic, and I like your framing @T-Gro. Above all it's very important to approach anything in this space from the perspective of "how are we going to implement this", including in the IDE. Anything here requires very deep changes to how compilation and analysis proceed and needs very close attention to detail.

On the whole I'm going to stay out of this directly - it's important, but not my battle :) I'll jot a few notes which might be useful.

  • The framing you have is good - "think of it as a single C# file in an F# compilation" - as are the subsequent discussions about projects etc.

  • There is an existing mechanism to inject arbitrary .NET content mid-way through the F# compilation process - generative type providers. Generated types are provided by handing over an assembly and the types are rewritten at the IL level to become part of the output assembly. It's worth noting we added this for very similar reasons - .NET 1-3.x libraries were using C# code generation tools and we wanted that available in F#.

    The handover is actually pretty simple - the generative type provider reports DLLs to incorporate (via provided types that have a different assembly), and the types are rewritten and renamed as part of compilation. There is no integration with projects or build (so the TP must detect changes in inputs - e.g. DB schema - and report invalidation), and the TP has no access to types generated in the current assembly.

    I believe these could today be used to host arbitrary C# code in a CSharpProvider<" cs code ">.

    Is this a useful starting point? I'm not sure. At the high level there's no reason the TP architecture couldn't be modified a bit to allow that input to be in a source generator file instead (and if necessary adjusted so no explicit declaration is even needed in source code). Does that get close enough that you could extend and modify the mechanism to host source generators? I'm not sure. Maybe. It's worth thinking about.

    1. Certainly F# RFC FS-1023 (https://github.com/fsharp/fslang-design/blob/main/RFCs/FS-1023-type-providers-generate-types-from-types.md) is necessary. That would be a very powerful addition to F# in any case.

    2. Some holes may need to be fixed in what TPs can provide.

    3. The C# code may provide partial types, as mentioned above. But perhaps the TP architecture could be adjusted to allow merging of types into F# types.

    4. Regarding IDE builds, dependencies, projects and so on - the TP architecture would need to be adjusted/enriched to allow the TP to actively host a design-time build for the C# project. Or else the TP would simply be re-run in some non-incremental mode. I'm not sure.

One advantage of using an extensibility point is that the code generator and Roslyn compilation would be held "at arm's length", hosted in the TP. Further, you could version that component separately. In principle you could alternatively design a different extensibility point that achieved a similar thing. I've got a feeling it would look a lot like generative TPs.

Anyway, on the whole I'd recommend having a good think about factoring things this way. That is, via an extension point, rather than direct integration. Maybe F#-for-.NET would then come with a RoslynSourceGenerator TP thing with all the build logic automagically hooked up. Maybe not. But decoupling may be very valuable here.

If you did go down the route of extending the existing TP mechanism, other good things could potentially drop out, e.g.

Some general comments - I personally think F#'s future existence is firmly rooted in being both a Javascript language and .NET language - and we should assess everything we do from this perspective. We must also focus on F#'s own existence as its own set of libraries and ecosystem, rather than always being downstream from .NET change and churn - most of which is now frankly treating .NET as a single-language ecosystem.

To put that in perspective, in the past 90% of our efforts have been to interoperate with .NET assets. While that's been great for properly-designed truly cross-language core libraries, it's often not turned out to be very fruitful for anything that involves complex compilation (e.g. IDE tooling using code generation, likewise database and service generators). We can burn a lot of time and energy to interoperate with these libraries, and doing so can suck us into very deep dependencies on C# both technically and culturally. So I recommend looking for an approach to this that is fundamentally F#-first, where what you want drops out as an instance of a more generic capability.

En3Tho, auduchinok, TheAngryByrd, robitar, davidglassborow, and daniellittledev reacted with thumbs up emoji

Contributor

En3Tho commented Nov 24, 2022

edited

One of the problems I recently hit when trying to make a wrapper around Blazor (as thin as it could possibly be) is that it's currently impossible to inline an AST or make a type partial. There is Myriad, and it's a good tool I guess, but it suffers from this limitation too. I can actually imagine partial modules. It should be the option with the fewest limitations and obstacles. Not sure about partial types though. @dsyme can you please share whether you have ever given a thought to partial modules/types?

To illustrate the situation:
Consider we have a component like this:

type HelloWorldFSharp() =
    inherit ComponentBase()

    [<Parameter; EditorRequired>]
    member val Name = "" with get, set

    [<Parameter>]
    member val Name2 = "F#" with get, set

    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            h1 {
                $"Hello, {this.Name} from {this.Name2}!"
            }
        })

The obstacle is that Name and Name2 are set via RenderTreeBuilder, not by direct assignments like Name = ... and Name2 = ... So I've decided that codegen is the best thing I can do here:

[<AutoOpen>]
module HelloWorldFSharp__Import =
    open FSharpComponents
    open System
    open System.Runtime.CompilerServices // for the IsReadOnly attribute

    type [<Struct; IsReadOnly>] HelloWorldFSharp__Import(builder: BlazorBuilderCore) =

        member this.Name2 with set(value: String) =
            builder.AddAttribute("Name2", value)

        interface IComponentImport with
            member _.Builder = builder

    type HelloWorldFSharp with
        static member inline Render(builder: BlazorBuilderCore, name: String) =
            builder.OpenComponent<HelloWorldFSharp>()
            builder.AddAttribute("Name", name)
            HelloWorldFSharp__Import(builder)

And then import:

type Importer() =
    inherit ComponentBase()
    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            fun b -> HelloWorldFSharp.Render(b, "C#", Name2 = "VB")
        })

The problem is that this generated import and Render extension should live right below HelloWorldFSharp type. Currently it is impossible unless you write this thingy by hand.