Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

So frustrating. You got like 80% of the way there, and then went "nope, too much work" and diverted to add more complexity.

The answer is to write the SQL yourself, and the scan methods yourself. Code generation is better than ORM, but still a wrapper, still adds complexity, and still brings problems.

Yes it's a pain in the arse to write all that boilerplate in one go (pun intended). But if you'd started without an ORM you'd have written them one at a time as you needed them and barely noticed it.

Keeping your code aligned with your database schema is very little effort - database schema changes are usually rare and significant.

I write a view for each access method (so I can change the schema without worrying about changing every access method), and a function for each update/insert/delete (for the same reasons). It takes maybe 20 mins to write the whole set each time I have to add a feature with new data, which is a rounding error in the time it takes to write all the rest of it.

The point is that the database schema is optimised to storing data in the best way possible. The middle layer is optimised for processing data in the best way possible, and the front end is optimised for displaying data in the best way possible. None of these three things are equal. Use an interface between each of them. The interface is important and needs to be carefully considered.



  > I write a view for each access method (so I can change the schema without worrying about changing every access method), and a function for each update/insert/delete (for the same reasons).
God how I hate supporting such architecture. It loses edit history unless you're very careful (which many developers are not) and do absolutely every piece of DDL in a versioned migration, and makes DDL extremely painful because you frequently have to drop two dozen functions and views (which depend on each other recursively) and then recreate them.


We’ve started using views and friends a lot more recently (now I’ve recovered from my PTSD from a lifetime ago when I worked in a poorly managed sql server shop). As you point out, they’re often not first class citizens.

On the flip side, we’ve just moved ours to files on disk (one per object) and hacked in some tooling to autogenerate the migrations from them (we use alembic / sqlalchemy which already does a good job here).

It’s kinda the promised land. Edit code on disk, commit and review as usual.


this. Don't write your SQL into the database. Write them as scripts (starting with "drop XYX", then "create XYZ", check them into git, and treat them as code.

Migrations are for schema changes. Views and functions are not schema.


Wait. People don't do this?


In the dark times even SCM was rare. We were ignorant cowboys strait out of school. We learned things the hard way. Then found out most such problems were solved decades before.


No, most don't


I understand why people might hate the idea of using raw SQL when I suggest it then! I always assumed it’d be version controlled and 100% replicable (minus data) from sql scripts


> I always assumed it’d be version controlled and 100% replicable (minus data) from sql scripts

It probably goes a bit deeper than that:

Some have version controlled migrations, in both directions (both adding changes and reverting them) and have logic that allows getting to the current state from an empty DB/schema, even generating some test data (seeding) along the way if needed, e.g. in a dev/test environment.

Others probably are half way there, utilizing whichever of those practices make sense, though sometimes using all of them is impossible for either social reasons or historical ones (e.g. the first migrations weren't properly versioned, so without a full data/schema import or a baseline version, it's impossible to run all of the migrations sequentially and start with an empty DB/schema.

Others don't bother with any of that and handle everything in an ad hoc manner, which may work for them, but personally i'd strongly advise against that, because in cases like that touching the DB becomes a lot more risky, unless you do a backup/export before any of your own changes.


I’ve worked in all those places (currently in the fully versioned model).

We occasionally do a full reset using an export of the current schema as a new base and then start versioning again from there. So there’s not really a good technical reason to stop people from getting a working runnable migration system going - other than social, as you say.


> So there’s not really a good technical reason...

One that i can think of is the application depending on some set of data being present in the system, which was added ages ago and that no one has any idea about across the hundreds of tables.

Furthermore, you probably can't just do a full export/import of the initial state, because everyone having to download about 2 GB of data a large portion of which is auditing related information probably also isn't all that good.

Well, that's more of a technical aspect that's caused by social factors, to be honest, but probably is a blocker nonetheless. Especially if you can't convince anyone to spend a week exploring the schema and setting up this baseline migration & data, when things currently "work well enough" and when the company doesn't give you a data storage solution to keep it at (e.g. NextCloud), doesn't let you use Git LFS or something like that, anyways...


the data in your system is massively more important than the code. Not spending a week exploring and completely understanding your schema is going to hurt you in the long term.


I spent years writing ad-hoc SQL, and had no occasion to learn about or use ORMs, but I did dabble a little in T-SQL and PL/SQL, and they just seemed wrong to me. Not integrated in a clean manner, not derived from a single vision, and so on. I guess what people call an "impedance mismatch". There is excessive overhead just from switching from SQL to the imperative language. I've gotten four orders of magnitude speedup by reformulating a T-SQL script as a single SQL query.

But recently I learned about Power Query (the "M" language standing for Mashup) and I immediately wanted so badly for SQL to acquire all the good aspects of it. Not as a separate module, but perhaps as a successor language.

In particular, the way it allows functions (and higher-order functions) to be defined is wonderful.

Many people don't like SQL, but I never had the feeling that here is what it's missing until now.

I qualify my comments above by mentioning I have no experience with OLTP or in general SQL that needs to be performant on the millisecond to second timescale.

Instead, I've always been working with queries on the basis that they need to run in 5 minutes, or ten, or fifteen, or an hour, as opposed to a day, or a week, or a month.


I'm shocked more people arent talking about SQLBoiler in threads like these. It solves this exact problemset. You write the SQL schema and it generates all the scan and helper functions for you. We've had a great experience with it at work after running into similar woes as OP with ORM's.

https://github.com/volatiletech/sqlboiler


What are your thoughts on what the article author has to say about SQLBoiler?


For larger and longer running projects, I prefer sqlc. There is zero magic involved, but it still generates the boilerplate for you. If for some reason you decide you don't like it, you can just pretend it never existed.


> I write a view for each access method (so I can change the schema without worrying about changing every access method), and a function for each update/insert/delete (for the same reasons).

Database view and stored procedures?


yeah. Postgres has functions, SQL Server has sprocs.


Putting all SQL into the database gets you two huge advantages: First, all relations and dependencies are visible for DDL changes, and what the code uses is easy to see without looking through the code. Second, a test database can use (near) live data with the new DDL and new code, making testing much better especially for non-development staff who can spot issues based on the data or UX.


> I write a view for each access method (so I can change the schema without worrying about changing every access method), and a function for each update/insert/delete (for the same reasons)

I’d appreciate it if you could expand on what this means exactly.


Not OP, but sounds like they're using views for reads and functions (stored procs) for writes.

Idea is to insulate code from schema changes as much as possible. If the underlying schema changes, it's possible to sync the views/functions to those changes so the code itself isn't impacted, as it's just talking to the views and functions. Kind of an exposed interface to the DB.

It's not perfect, and some schema changes will invariably require code changes, but should be minimized.

There are pros and cons to this approach and I personally don't favor it, although I am aware of some shops that made it an absolute requirement--especially pre-ORM.


What is the idea behind using views? I have never used them


Not OP but two main things:

1. Access control - you can create new SQL accounts and only give them access to select from specified views

2. Encapsulation and implementation hiding - the underlying table structure can change as long as the views exposed doesn't change.

At least that's all I know. I'm sure people will add or correct when they read this.


These are correct - some others are:

- ability to do transformations uniformly e.g. if you want to represent a value in a certain way uniformly instead of having to do that in multiple places

- ability to get generated values (like 'create_ts() - now()' to get the age of the record) directly from the database instead of computing it repetitively

- move the join mechanisms on the database side to do those efficiently instead of the ORM having to do it (sometimes suboptimally); also helps with joins that are repetitive

- ability to apply certain "default" filters e.g. if you are always fetching with predicate 'active = true', then you can include that in the view


I used views several months ago when I implemented a tracking database for AWS SSM parameters: I wanted to retain a history of all changes to the parameters over time, and the schema was complex enough that queries for simple things like “what’s the current value?” weren’t simple at all.

So, I wrapped common queries in views.

(None of this should be taken as advice for how to approach such a problem.)


What if there are no layers and you can’t think outside hierarchy?

There’s no reason for anyone else to take your view as sacrosanct since many many many apps and services are working just fine without ORMs.

You might accomplish something interesting today if you set aside protecting unimportant ephemera you talked yourself into believing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: