Conversation

@TangoYankee
Member

Document API naming conventions for methods, functions, and OpenAPI operations

@TangoYankee requested a review from a team on February 4, 2025, 18:04
@TangoYankee
Member Author

We recently had a discussion around abbreviating "city council district" and "community district" as "ccd" and "cd". The objection to these initialisms is that it's difficult to remember what they mean. The general rule is that we should favor writing out the full word when practical.

However, we write bbl as the example parameter in this guidance. Also, Neighborhood Tabulation Areas are widely abbreviated as nta and it would be tedious to write it out every time. Id itself is an abbreviation of Identification. Clearly, there are terms we feel better served by abbreviating.

Personally, I think we should still prefer to write out terms. However, if a term has an abbreviation that has reached common use, it is better to use this abbreviation.

@dhochbaum-dcp

What should one name a function without a limit, if a corresponding function with a limit exists?

@TangoYankee
Member Author

What should one name a function without a limit, if a corresponding function with a limit exists?

I'm not sure I understand the context enough. Is there a specific example you're thinking of?

@dhochbaum-dcp

Let's say I want to add a function like this findMany but without a limit or offset.

@dhochbaum-dcp

We recently had a discussion around abbreviating "city council district" and "community district" as "ccd" and "cd". The objection to these initialisms is that it's difficult to remember what they mean. The general rule is that we should favor writing out the full word when practical.

However, we write bbl as the example parameter in this guidance. Also, Neighborhood Tabulation Areas are widely abbreviated as nta and it would be tedious to write it out every time. Id itself is an abbreviation of Identification. Clearly, there are terms we feel better served by abbreviating.

Personally, I think we should still prefer to write out terms. However, if a term has an abbreviation that has reached common use, it is better to use this abbreviation.

I'm unclear on what you mean by "common use". The common use of bbl has a different meaning for the general population than it does in-context for the users of our products. Those same people who see bbl and think "Borough, Block, and Lot" likely see cd and think "Community District". Are we writing out "Community District Tabulation Areas" or using "CDTA"? Are we expecting people who know what "CDTA" means not to know what "CD" means?

@TangoYankee
Member Author

Let's say I want to add a function like this findMany but without a limit or offset.

Ahh, it would still be findMany but the function signature wouldn't take a limit parameter. This is how we handle boroughs, which we do not paginate.

I prefer using findMany for both cases over using findAll for the non-limit case. I don't really see a case where we want a limit and a non-limit version to exist in the same domain; we either want to paginate that data or we don't. We can use findMany and it still makes sense for either use case, meaning we can more easily refactor if we do want to add or remove pagination from that domain.
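
A minimal sketch of what I mean, assuming hypothetical in-memory repositories. The `Borough` and `TaxLot` shapes, the class names, and the sample data are illustrative only and not taken from the codebase. Both repositories expose findMany; only the paginated domain takes limit and offset.

```typescript
// Hypothetical record shapes, for illustration only.
interface Borough {
  id: string;
  title: string;
}

interface TaxLot {
  bbl: string;
  address: string;
}

// Unpaginated domain: findMany takes no limit and returns every row.
class BoroughRepository {
  private readonly rows: Borough[];

  constructor(rows: Borough[]) {
    this.rows = rows;
  }

  findMany(): Borough[] {
    return [...this.rows];
  }
}

// Paginated domain: same method name, but the signature takes limit and offset.
class TaxLotRepository {
  private readonly rows: TaxLot[];

  constructor(rows: TaxLot[]) {
    this.rows = rows;
  }

  findMany({ limit, offset = 0 }: { limit: number; offset?: number }): TaxLot[] {
    return this.rows.slice(offset, offset + limit);
  }
}

// Made-up sample data.
const boroughs = new BoroughRepository([
  { id: "1", title: "Manhattan" },
  { id: "2", title: "Bronx" },
]);

const taxLots = new TaxLotRepository([
  { bbl: "1000010001", address: "Sample address A" },
  { bbl: "1000010002", address: "Sample address B" },
  { bbl: "1000010003", address: "Sample address C" },
]);

const allBoroughs = boroughs.findMany();
const secondPage = taxLots.findMany({ limit: 2, offset: 1 });
```

Adding or removing pagination from a domain then only changes the parameters, not the method name or its callers' vocabulary.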

@TangoYankee
Member Author

TangoYankee commented Feb 12, 2025

We recently had a discussion around abbreviating "city council district" and "community district" as "ccd" and "cd". The objection to these initialisms is that it's difficult to remember what they mean. The general rule is that we should favor writing out the full word when practical.
However, we write bbl as the example parameter in this guidance. Also, Neighborhood Tabulation Areas are widely abbreviated as nta and it would be tedious to write it out every time. Id itself is an abbreviation of Identification. Clearly, there are terms we feel better served by abbreviating.
Personally, I think we should still prefer to write out terms. However, if a term has an abbreviation that has reached common use, it is better to use this abbreviation.

I'm unclear on what you mean by "common use". The common use of bbl has a different meaning for the general population than it does in-context for the users of our products. Those same people who see bbl and think "Borough, Block, and Lot" likely see cd and think "Community District". Are we writing out "Community District Tabulation Areas" or using "CDTA"? Are we expecting people who know what "CDTA" means not to know what "CD" means?

It's a judgement call. But if an abbreviation gives someone significant pause, then it's better to write it out. CD and CCD give me significant pause. Please understand that.

@dhochbaum-dcp

Let's say I want to add a function like this findMany but without a limit or offset.

Ahh, it would still be findMany but the function signature wouldn't take a limit parameter. This is how we handle boroughs, which we do not paginate.

I prefer using findMany for both cases over using findAll for the non-limit case. I don't really see a case where we want a limit and a non-limit version to exist in the same domain; we either want to paginate that data or we don't. We can use findMany and it still makes sense for either use case, meaning we can more easily refactor if we do want to add or remove pagination from that domain.

I see a case where we want to return all of the results of that findMany without a limit in order to generate a downloadable export.

@dhochbaum-dcp

We recently had a discussion around abbreviating "city council district" and "community district" as "ccd" and "cd". The objection to these initialisms is that it's difficult to remember what they mean. The general rule is that we should favor writing out the full word when practical.
However, we write bbl as the example parameter in this guidance. Also, Neighborhood Tabulation Areas are widely abbreviated as nta and it would be tedious to write it out every time. Id itself is an abbreviation of Identification. Clearly, there are terms we feel better served by abbreviating.
Personally, I think we should still prefer to write out terms. However, if a term has an abbreviation that has reached common use, it is better to use this abbreviation.

I'm unclear on what you mean by "common use". The common use of bbl has a different meaning for the general population than it does in-context for the users of our products. Those same people who see bbl and think "Borough, Block, and Lot" likely see cd and think "Community District". Are we writing out "Community District Tabulation Areas" or using "CDTA"? Are we expecting people who know what "CDTA" means not to know what "CD" means?

It's a judgement call. But if an abbreviation gives someone significant pause, then it's better to write it out. CD and CCD give me significant pause. Please understand that.

I understand that, and I greatly empathize with you. It seems like Tyler, you, and I are all on the same page: we should prioritize relying on memory as little as possible.

@TangoYankee
Member Author

Let's say I want to add a function like this findMany but without a limit or offset.

Ahh, it would still be findMany but the function signature wouldn't take a limit parameter. This is how we handle boroughs, which we do not paginate.
I prefer using findMany for both cases over using findAll for the non-limit case. I don't really see a case where we want a limit and a non-limit version to exist in the same domain; we either want to paginate that data or we don't. We can use findMany and it still makes sense for either use case, meaning we can more easily refactor if we do want to add or remove pagination from that domain.

I see a case where we want to return all of the results of that findMany without a limit in order to generate a downloadable export.

We haven't had to consider downloads. The closest we've come is with the "mvt" endpoints, which return files. For those, we've been using "findTiles" as the name. However, we may not want to keep this pattern for other file types. This is a chance to get ahead of a possible feature request by having a convention in place. Do you have something in mind?

I think we would want to consider:

  • The convention of the HTTP request we use to specify that it's a download:
    • Is it a path parameter?
    • A query parameter?
    • A request header?
    • Something else?
  • The operationId in the OpenAPI spec
    • Should it be its own operation with its own name?
  • The controller name
  • The service name
  • The repository name
  • Do these names differ based on the type of file?
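
To make the first question concrete, here is a rough sketch of the three ways a request could signal a CSV download. Everything in it is an assumption for the sake of comparison: the `/tax-lots` route, the `format` query parameter, and the `DownloadRequest` shape do not exist in our API today.

```typescript
// Hypothetical request shape; only the parts relevant to the three options.
interface DownloadRequest {
  path: string;
  query: Record<string, string>;
  headers: Record<string, string>;
}

type ResponseFormat = "json" | "csv";

// Option 1: path parameter, e.g. GET /tax-lots/csv
function formatFromPath(request: DownloadRequest): ResponseFormat {
  const lastSegment = request.path.split("/").pop();
  return lastSegment === "csv" ? "csv" : "json";
}

// Option 2: query parameter, e.g. GET /tax-lots?format=csv
function formatFromQuery(request: DownloadRequest): ResponseFormat {
  return request.query["format"] === "csv" ? "csv" : "json";
}

// Option 3: request header, e.g. Accept: text/csv
// Assumes header names have already been lowercased by the framework.
function formatFromHeader(request: DownloadRequest): ResponseFormat {
  const accept = request.headers["accept"] ?? "";
  return accept.includes("text/csv") ? "csv" : "json";
}

// Made-up requests, one per option.
const byPath: DownloadRequest = { path: "/tax-lots/csv", query: {}, headers: {} };
const byQuery: DownloadRequest = { path: "/tax-lots", query: { format: "csv" }, headers: {} };
const byHeader: DownloadRequest = { path: "/tax-lots", query: {}, headers: { accept: "text/csv" } };
```

The path option probably implies a separate OpenAPI operation with its own name, while the query and header options could share one operation with the JSON response. That ties back to the naming questions in the rest of the list.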

@dhochbaum-dcp

Let's say I want to add a function like this findMany but without a limit or offset.

Ahh, it would still be findMany but the function signature wouldn't take a limit parameter. This is how we handle boroughs, which we do not paginate.
I prefer using findMany for both cases over using findAll for the non-limit case. I don't really see a case where we want a limit and a non-limit version to exist in the same domain; we either want to paginate that data or we don't. We can use findMany and it still makes sense for either use case, meaning we can more easily refactor if we do want to add or remove pagination from that domain.

I see a case where we want to return all of the results of that findMany without a limit in order to generate a downloadable export.

We haven't had to consider downloads. The closest we've come is with the "mvt" endpoints, which return files. For those, we've been using "findTiles" as the name. However, we may not want to keep this pattern for other file types. This is a chance to get ahead of a possible feature request by having a convention in place. Do you have something in mind?

I think we would want to consider:

  • The convention of the HTTP request we use to specify that it's a download:
    • Is it a path parameter?
    • A query parameter?
    • A request header?
    • Something else?
  • The operationId in the OpenAPI spec
    • Should it be its own operation with its own name?
  • The controller name
  • The service name
  • The repository name
  • Do these names differ based on the type of file?

If we are going to continue to allow users to export data to CSV that matches the filters they have selected, we will need to do so dynamically. I don't have answers to everything you listed, but my assumptions in asking were that we would need a function that returns the same results as that findMany, but without a limit, and that this function would live in the same repository.

@dhochbaum-dcp

I'm confused - why did #402 use agencyBudget instead of agencyBudgetCode?

@TangoYankee
Member Author

I'm confused - why did #402 use agencyBudget instead of agencyBudgetCode?

You'll see in the edits that my penultimate version had it as agencyBudgetCode. I think I also had managingAgency as managingAgencyInitials. But then, I chickened out and changed them back. Horatio did opine in person about changing it to agencyBudgetCode but we never went through the process of having the discussion on the ticket.

So... it was a missed opportunity to be consistent.

@TylerMatteo
Collaborator

@dhochbaum-dcp @TangoYankee I'd like to revive this conversation and see if we can't get this merged in. Rereading the back and forth, it sounds like one point of contention is findMany versus findAll? As both of you pointed out, we do have use cases where we won't pass a limit but my thinking is that doesn't preclude us from using findMany. In my experience, folks tend to go with All or Many, not both, which I think makes sense for us as well. Part of why I prefer Many is that it can mean "up to and including the entire list" so I don't see why it couldn't work for those use cases where we don't have a limit. @dhochbaum-dcp am I missing anything?

On the topic of abbreviating, I think the three of us are generally on the same page but maybe it's worth thinking about how this documentation could be improved to give some clarity on when to abbreviate or not? This is something that is going to be gray area no matter what, which is fine and normal (fwiw, I came across API design documentation from the VA a while back that lists avoiding abbreviation as a "should").

@dhochbaum-dcp

I think we can merge this in. I am fine with using findMany, and it looks like for functions that will generate CSV files, we will use findManyCSV.

@TylerMatteo
Collaborator

I think we can merge this in. I am fine with using findMany, and it looks like for functions that will generate CSV files, we will use findManyCSV.

@TangoYankee Thoughts on @dhochbaum-dcp's comment here? Makes sense to me, though I might prefer findManyCsv if we're being specific.
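
For reference, a minimal sketch of how the two methods might sit side by side in one repository. The `Project` shape, the class name, and the sample data are hypothetical, and the `findManyCsv` casing is used here only as a placeholder until we settle on `Csv` versus `CSV`.

```typescript
// Hypothetical record and parameter shapes, for illustration only.
interface Project {
  id: string;
  managingAgency: string;
}

interface FindManyParams {
  managingAgency?: string;
  limit: number;
  offset?: number;
}

class ProjectRepository {
  private readonly rows: Project[];

  constructor(rows: Project[]) {
    this.rows = rows;
  }

  // Shared filter logic so both methods return the same matches.
  private filter(managingAgency?: string): Project[] {
    return managingAgency === undefined
      ? [...this.rows]
      : this.rows.filter((row) => row.managingAgency === managingAgency);
  }

  // Paginated lookup.
  findMany({ managingAgency, limit, offset = 0 }: FindManyParams): Project[] {
    return this.filter(managingAgency).slice(offset, offset + limit);
  }

  // Same filters, no limit or offset; returns CSV text for an export.
  // Values are not escaped here; a real export would need to handle commas and quotes.
  findManyCsv({ managingAgency }: { managingAgency?: string }): string {
    const header = "id,managingAgency";
    const lines = this.filter(managingAgency).map(
      (row) => `${row.id},${row.managingAgency}`,
    );
    return [header, ...lines].join("\n");
  }
}

// Made-up sample data.
const projects = new ProjectRepository([
  { id: "a", managingAgency: "DOT" },
  { id: "b", managingAgency: "DPR" },
  { id: "c", managingAgency: "DOT" },
]);

const firstPage = projects.findMany({ managingAgency: "DOT", limit: 1 });
const exportCsv = projects.findManyCsv({ managingAgency: "DOT" });
```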
