Data Persistence with Spring Data JPA
Most microservices need to store state. Spring Data JPA provides a repository abstraction over JPA (Hibernate), significantly reducing boilerplate code.
Real-world analogy: Think of JPA as a universal translator between your Java objects and your relational database. Your code speaks Java (objects, fields, methods), and your database speaks SQL (tables, columns, rows). Hibernate is the interpreter that converts between the two languages in real time. Spring Data JPA sits on top of Hibernate and acts like a personal assistant: you describe what data you want (via method names like findByPriceLessThan), and it writes the SQL for you. You never have to learn the database’s dialect directly, though understanding it makes you far more effective when things go wrong.
1. Dependencies
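In pom.xml (or build.gradle), add the spring-boot-starter-data-jpa starter along with a JDBC driver for your database (for example, H2 for local development).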
2. Defining Entities
An Entity represents a table in your database.
Be careful with Lombok’s @Data on entities: it generates equals() and hashCode() using all fields, including the @Id. This breaks JPA’s identity semantics. The ID is null before persist and assigned afterwards, so an entity’s hash code changes once it is saved, and two Product objects that represent the same DB row (for example, loaded in different persistence contexts) compare as unequal as soon as any other field differs. In production, use @Getter, @Setter, and @ToString separately, and either write equals()/hashCode() by hand based on a business key or use @EqualsAndHashCode(onlyExplicitlyIncluded = true) with @EqualsAndHashCode.Include on the id field.
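The hash-code problem can be reproduced without JPA at all. The following is a minimal plain-Java sketch, not the original entity: AllFieldsProduct, BusinessKeyProduct, and the sku field are hypothetical names, and assigning the id by hand stands in for what persist would do.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity stand-in: equality over ALL fields, including the generated id.
class AllFieldsProduct {
    Long id;            // null until the row is inserted
    final String sku;

    AllFieldsProduct(String sku) { this.sku = sku; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AllFieldsProduct)) return false;
        AllFieldsProduct other = (AllFieldsProduct) o;
        return Objects.equals(id, other.id) && Objects.equals(sku, other.sku);
    }

    @Override
    public int hashCode() { return Objects.hash(id, sku); }
}

// Alternative: equality over an immutable business key only (the SKU here).
class BusinessKeyProduct {
    Long id;
    final String sku;

    BusinessKeyProduct(String sku) { this.sku = sku; }

    @Override
    public boolean equals(Object o) {
        return o instanceof BusinessKeyProduct && sku.equals(((BusinessKeyProduct) o).sku);
    }

    @Override
    public int hashCode() { return sku.hashCode(); }
}

class EntityIdentityDemo {
    static boolean allFieldsFoundAfterPersist() {
        Set<AllFieldsProduct> set = new HashSet<>();
        AllFieldsProduct p = new AllFieldsProduct("SKU-1");
        set.add(p);
        p.id = 42L;                 // simulates the id assignment done on persist
        return set.contains(p);     // hash changed, so the lookup misses
    }

    static boolean businessKeyFoundAfterPersist() {
        Set<BusinessKeyProduct> set = new HashSet<>();
        BusinessKeyProduct p = new BusinessKeyProduct("SKU-1");
        set.add(p);
        p.id = 42L;                 // same simulated persist
        return set.contains(p);     // hash does not depend on the id
    }

    public static void main(String[] args) {
        System.out.println("all-fields found after persist:   " + allFieldsFoundAfterPersist());
        System.out.println("business-key found after persist: " + businessKeyFoundAfterPersist());
    }
}
```

The business-key variant only works if the key really is immutable and unique; when no such key exists, a constant hashCode() combined with id-based equals() is a common fallback.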
3. The Repository Interface
This is where the magic happens. You don’t need to write implementation classes.
4. Service Layer & Transactions
Business logic lives in the Service layer, not the controller.
@Transactional Explained
Analogy: A transaction is like an “undo” button for your database. You group a set of operations together and say “either all of these succeed, or pretend none of them happened.” If you are transferring money between two bank accounts, you want to debit one AND credit the other. If the credit fails, the debit must be rolled back. That is a transaction.
- Atomicity: Either all operations in the method succeed, or none do.
- Rollback: If a RuntimeException is thrown, the transaction rolls back automatically. Checked exceptions (like IOException) do not trigger rollback by default, which catches many developers off guard.
- Propagation: If one transactional method calls another, how do they relate? (The default, REQUIRED, joins the existing transaction.)
@Transactional works via proxies. The proxy wraps your bean and intercepts method calls. But if the method is private, the proxy cannot intercept it, so the annotation is silently ignored. Your code runs without a transaction and you will not get an error — only mysterious data inconsistencies in production. Always use public methods for @Transactional.
5. H2 Console
When using H2 (in-memory DB), you can view the data in a browser. Add spring.h2.console.enabled=true to application.properties, then open http://localhost:8080/h2-console.
6. Projections
Sometimes you don’t want the full Entity. You just want a slice of data.
7. The N+1 Query Problem
This is the most common performance killer in Hibernate, and it has sunk more production systems than most developers realize.
Analogy: Imagine you are a teacher checking attendance. The N+1 approach is calling each student’s parent individually to ask “Is your child here today?”, one phone call per student. The JOIN FETCH approach is calling the school office once and getting the full attendance sheet for the entire class.
Imagine: 1 Author has N Books. Loading the authors takes 1 query, and touching each author’s lazily loaded books triggers one more query per author.
Detection: Set spring.jpa.properties.hibernate.generate_statistics=true and watch for high query counts. Tools like Hibernate Query Log or p6spy can also flag suspicious query patterns. In CI, you can even fail the build if a test exceeds a query count threshold using libraries like datasource-proxy.
8. Concurrency Control (Locking)
What if two users update the same product price at the exact same millisecond? This is the “Lost Update” problem: one user’s change silently overwrites the other’s.
Analogy: Two people editing the same Google Doc paragraph at once. Without conflict detection, the last person to save wins and the first person’s edits vanish without a trace.
Optimistic Locking (Recommended for most cases)
Add a @Version field. This is a “check before you write” strategy.
On each update, Hibernate includes the version in the WHERE clause: UPDATE product SET price = 10, version = 2 WHERE id = 1 AND version = 1.
If the version doesn’t match (someone else updated it between your read and write), it throws OptimisticLockException. No database locks are held — this is purely application-level conflict detection.
When to use: High-read, low-write workloads. Most CRUD APIs. Shopping carts, user profiles, product catalogs.
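The version check can be illustrated with an in-memory stand-in for a single row. This is a plain-Java sketch of the idea only, not how Hibernate is implemented; VersionedRowDemo and Row are hypothetical names, and the AtomicReference plays the role of the database.

```java
import java.util.concurrent.atomic.AtomicReference;

class VersionedRowDemo {
    // Immutable snapshot of one row: a price plus its version counter.
    static final class Row {
        final int price;
        final long version;

        Row(int price, long version) {
            this.price = price;
            this.version = version;
        }
    }

    // The "table" holds a single row, starting at version 1.
    private final AtomicReference<Row> row = new AtomicReference<>(new Row(5, 1));

    Row read() { return row.get(); }

    // Mirrors "UPDATE ... SET price = ?, version = version + 1 WHERE version = ?".
    // Returns the number of rows updated: 1 on success, 0 if the version is stale.
    int update(int newPrice, long expectedVersion) {
        Row current = row.get();
        if (current.version != expectedVersion) {
            return 0;   // someone else updated the row since we read it
        }
        Row next = new Row(newPrice, expectedVersion + 1);
        return row.compareAndSet(current, next) ? 1 : 0;
    }

    public static void main(String[] args) {
        VersionedRowDemo db = new VersionedRowDemo();
        long seenByAlice = db.read().version;
        long seenByBob = db.read().version;
        System.out.println("Alice updated rows: " + db.update(10, seenByAlice));
        System.out.println("Bob updated rows:   " + db.update(12, seenByBob));
    }
}
```

A return value of 0 is the situation in which Hibernate would raise the optimistic locking exception; the caller then has to reload the row and decide whether to retry.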
Pessimistic Locking
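Lock the database row so no other transaction can modify it (or take a conflicting lock on it) until you are done. In Spring Data JPA, annotate the repository query method with @Lock(LockModeType.PESSIMISTIC_WRITE). To avoid waiting indefinitely on a contended row, add a lock timeout: @QueryHints(@QueryHint(name = "jakarta.persistence.lock.timeout", value = "3000")) (3-second timeout).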
9. Auditing
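Keep track of “Who changed what and when” automatically.
- Add @EnableJpaAuditing to the main class.
- Add audit fields to the Entity, for example timestamps annotated with @CreatedDate and @LastModifiedDate, and register @EntityListeners(AuditingEntityListener.class) on the entity class.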
10. Testing with @DataJpaTest
Don’t use the full @SpringBootTest for DB tests (too slow). Use Slice Testing.
11. JPA Architecture
12. Deep Dive: Transaction Management
Handling transactions correctly is what separates seniors from juniors.
Propagation Levels (@Transactional(propagation = ...))
| Level | Description | Use Case |
|---|---|---|
| REQUIRED (Default) | Join the existing transaction. If none exists, create a new one. | Most business logic. |
| REQUIRES_NEW | Suspend the current transaction. Create a brand new independent one. | Audit logging (save the log even if the main logic fails). |
| MANDATORY | Must be called inside a transaction. Otherwise throw an exception. | Helper methods that shouldn’t run standalone. |
| SUPPORTS | Run in a transaction if one exists. Otherwise run non-transactionally. | Read-only operations. |
| NOT_SUPPORTED | Suspend the current transaction. Run non-transactionally. | Sending emails/long processes (don’t hold DB locks). |
| NESTED | Create a savepoint within the existing transaction. | Complex rollbacks (try a sub-task; if it fails, roll back only the sub-task). |
Isolation Levels (@Transactional(isolation = ...))
Defines “how much” one transaction sees of another.
- READ_UNCOMMITTED: Dirty Reads allowed. (Dangerous).
- READ_COMMITTED: PostgreSQL Default. No Dirty Reads.
- REPEATABLE_READ: No Non-Repeatable Reads. (MySQL Default).
- SERIALIZABLE: Transactions behave as if they ran one at a time. Slowest but safest.
Rollback Rules
By default, Spring ONLY rolls back on RuntimeException (unchecked) and Error.
It does NOT roll back on checked exceptions (e.g., IOException).
This is one of the most dangerous default behaviors in Spring. Your method throws an IOException, the transaction commits the partial state, and you now have corrupted data in production. Nothing in the logs tells you the transaction committed — you only find out when a customer reports a wrong balance.
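The default rule is small enough to model directly. The sketch below is a simplified plain-Java model of the decision, not Spring’s actual implementation; RollbackRuleDemo and its method names are hypothetical, and the second method approximates what a single rollbackFor entry adds.

```java
import java.io.IOException;

class RollbackRuleDemo {
    // Default behavior: roll back only for unchecked exceptions and errors.
    static boolean rollsBackByDefault(Throwable t) {
        return t instanceof RuntimeException || t instanceof Error;
    }

    // Simplified model of rollbackFor = X.class: instances of X also roll back.
    static boolean rollsBack(Throwable t, Class<? extends Throwable> rollbackFor) {
        return rollsBackByDefault(t) || rollbackFor.isInstance(t);
    }

    public static void main(String[] args) {
        Throwable checked = new IOException("disk full");
        Throwable unchecked = new IllegalStateException("bad state");
        System.out.println("IOException, default rules:        " + rollsBackByDefault(checked));
        System.out.println("IllegalStateException, default:    " + rollsBackByDefault(unchecked));
        System.out.println("IOException, rollbackFor=Exception: " + rollsBack(checked, Exception.class));
    }
}
```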
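Fix: Set rollbackFor = Exception.class on every @Transactional annotation. You can enforce this at the team level by creating a custom @BusinessTransaction meta-annotation that is itself annotated with @Transactional(rollbackFor = Exception.class), and using it in place of @Transactional.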
13. High-Performance Caching
Caching is often the cheapest way to improve read performance. Spring provides an abstraction over multiple caching providers.
Enable Caching
Add @EnableCaching to a configuration class (or the main application class).
Basic Usage
- @Cacheable: If the key exists in the cache, return the cached value. Otherwise, execute the method and cache the result.
- @CacheEvict: Remove the entry from the cache.
- @CachePut: Always execute the method AND update the cache.
Using Redis (Production)
By default, Spring uses ConcurrentHashMap (in-memory). This works fine for a single instance, but the moment you scale to multiple pods, each pod has its own independent cache. User A hits Pod 1 (cache miss, loads from DB), then User A hits Pod 2 (another cache miss, loads from DB again). You have zero cache benefit under a load balancer.
For distributed systems, use Redis — a shared, external cache that all pods read from.
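Dependency: spring-boot-starter-data-redis.
Once Redis is configured as the cache provider (for example, spring.cache.type=redis plus the connection settings), your existing @Cacheable annotations work identically.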
Pitfalls
- Serialization Issues: With the default Java serialization, your cached objects must be Serializable. If you change a field name or type in your DTO, old cached entries may no longer deserialize and you get runtime errors (for example, InvalidClassException or ClassCastException). Prefer Jackson-based JSON serialization over Java serialization; it tolerates schema evolution far better.
- Cache Stampede: If a popular cache entry expires, 1000 requests hit the database simultaneously. This can overwhelm the DB and cascade into a full outage. Use @Cacheable(sync = true) to ensure only one thread computes the value while others wait.
- Stale Data: Always define a TTL (Time to Live). Without one, cached data lives forever, and your users see stale prices, stale inventory counts, or stale permissions.
- Cache Aside vs. Read-Through: Spring’s @Cacheable implements the “cache aside” pattern (application manages the cache). For write-heavy workloads, consider “write-through” or “write-behind” strategies where the cache is updated on writes, not just reads.
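The stampede protection described above comes down to letting one caller load a missing key while the rest wait for its result. The following plain-Java sketch shows that idea with ConcurrentHashMap.computeIfAbsent inside a single JVM; it is an illustration under that assumption, not Spring’s cache implementation, and SingleFlightCacheDemo and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SingleFlightCacheDemo {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger();

    String get(String key) {
        // computeIfAbsent applies the loader at most once per absent key;
        // concurrent callers for the same key wait for that single result.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();          // stands in for the expensive DB query
            return "value-for-" + k;
        });
    }

    // Fires `threads` concurrent lookups of one key and returns how many loads ran.
    static int loadsUnderContention(int threads) {
        SingleFlightCacheDemo demo = new SingleFlightCacheDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();            // release all threads together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                demo.get("popular");
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return demo.loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads for 16 concurrent requests: " + loadsUnderContention(16));
    }
}
```

This only protects one JVM. With several pods sharing Redis, each pod can still trigger its own load, so a TTL with some randomness or a distributed lock may be needed on top.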
Interview Deep-Dive
Explain how @Transactional works under the hood. What actually happens when Spring encounters this annotation, and why does calling a @Transactional method from within the same class not work?
- At startup, Spring’s BeanPostProcessor (specifically InfrastructureAdvisorAutoProxyCreator) scans every bean. If a bean or its methods carry @Transactional, the BPP wraps the bean in a CGLIB proxy. The ApplicationContext stores the proxy, not your original object. When external code calls a @Transactional method, the call hits the proxy first.
- The proxy’s TransactionInterceptor kicks in. It reads the annotation’s attributes (propagation, isolation, rollbackFor), asks the PlatformTransactionManager (typically JpaTransactionManager for JPA) to begin a transaction, then calls proceed() on the actual target method. If the method completes normally, it commits. If a RuntimeException (unchecked) or Error is thrown, it rolls back. Critically, checked exceptions do NOT trigger rollback by default, which catches many developers off guard. You need @Transactional(rollbackFor = Exception.class) to cover checked exceptions.
- The self-invocation problem: when method A in OrderService calls this.methodB(), this refers to the raw target object, not the proxy. The proxy is an outer shell that intercepts calls from external callers. Internal calls bypass it entirely, so methodB()’s @Transactional annotation is invisible.
- Solutions: (1) Move methodB() to a separate @Service and inject it, which is the cleanest approach. (2) Self-inject the bean: inject OrderService into itself and call self.methodB(), because the injected reference is the proxy. (3) Use AopContext.currentProxy() after enabling @EnableAspectJAutoProxy(exposeProxy = true); this is brittle and not recommended for production.
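The self-invocation bypass can be observed with nothing but the JDK. The sketch below uses a java.lang.reflect.Proxy as a stand-in for the CGLIB proxy Spring would create; the mechanism by which the internal call skips the interceptor is the same. OrderOps, OrderOpsTarget, and SelfInvocationDemo are hypothetical names, and the handler merely records calls where a real interceptor would begin a transaction.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface OrderOps {
    void methodA();
    void methodB();
}

class OrderOpsTarget implements OrderOps {
    @Override
    public void methodA() {
        this.methodB();   // self-invocation: "this" is the raw target, not the proxy
    }

    @Override
    public void methodB() { }
}

class SelfInvocationDemo {
    // Returns the names of the methods the proxy actually intercepted, in order.
    static List<String> interceptedCalls(boolean alsoCallBThroughProxy) {
        List<String> intercepted = new ArrayList<>();
        OrderOps target = new OrderOpsTarget();

        InvocationHandler handler = (p, method, args) -> {
            intercepted.add(method.getName());   // a real interceptor would begin a transaction here
            return method.invoke(target, args);
        };

        OrderOps proxied = (OrderOps) Proxy.newProxyInstance(
                OrderOps.class.getClassLoader(),
                new Class<?>[] {OrderOps.class},
                handler);

        proxied.methodA();                       // intercepted; the inner methodB() call is not
        if (alsoCallBThroughProxy) {
            proxied.methodB();                   // intercepted, because it goes through the proxy
        }
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls(false));
        System.out.println(interceptedCalls(true));
    }
}
```

Only calls that enter through the proxied reference are recorded, which is why moving methodB() to another bean or self-injecting the proxy restores the transactional behavior.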
REQUIRES_NEW suspends the outer transaction and creates a completely independent inner transaction. The TransactionManager obtains a second database connection from the pool, so you now have two connections open simultaneously. If your connection pool max is 10 and you have deep REQUIRES_NEW nesting under concurrent load, you can exhaust the pool and deadlock: the outer transaction holds connection 1 and waits for the inner call to return, but the inner call cannot get connection 2. I have seen this take down production systems. Size your connection pool accounting for the maximum REQUIRES_NEW depth multiplied by concurrent request count.
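The sizing rule can be sanity-checked with a toy model in which a Semaphore stands in for the connection pool. This is a worst-case sketch under the assumption that every request grabs its outer connection before any of them asks for the inner one; PoolSizingDemo and its method are hypothetical names, and no real pool is involved.

```java
import java.util.concurrent.Semaphore;

class PoolSizingDemo {
    // Each request holds one "outer" connection and then needs a second "inner"
    // one while still holding the first. Returns how many requests obtained the
    // inner connection. Zero means every request is stuck waiting: a deadlock.
    static int requestsThatProceed(int poolSize, int concurrentRequests) {
        Semaphore pool = new Semaphore(poolSize);

        int holdingOuter = 0;
        for (int i = 0; i < concurrentRequests; i++) {
            if (pool.tryAcquire()) {
                holdingOuter++;          // outer transaction got its connection
            }
        }

        int proceeded = 0;
        for (int i = 0; i < holdingOuter; i++) {
            if (pool.tryAcquire()) {
                proceeded++;             // inner REQUIRES_NEW call got a second connection
            }
        }
        return proceeded;
    }

    public static void main(String[] args) {
        System.out.println("pool=10, requests=10 -> " + requestsThatProceed(10, 10));
        System.out.println("pool=15, requests=10 -> " + requestsThatProceed(15, 10));
        System.out.println("pool=20, requests=10 -> " + requestsThatProceed(20, 10));
    }
}
```

In the middle case some requests proceed, finish, and release their connections, so the system recovers; only the first case, where nobody can proceed, is a true deadlock until a pool timeout fires.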
What is the N+1 query problem in Hibernate, and describe at least three different strategies to solve it with their trade-offs.
- The N+1 problem: when you load a parent entity with a lazy @OneToMany collection, Hibernate executes 1 query for the parents and then N additional queries (one per parent) when you access each collection. With 1000 authors, that is 1001 queries instead of 1 or 2.
- Solution 1: JOIN FETCH in JPQL: SELECT a FROM Author a JOIN FETCH a.books. Single SQL JOIN query. Trade-off: Cartesian product explosion. If an author has 10 books and you load 100 authors, the result set has 1000 rows. Hibernate deduplicates, but the database still sends 1000 rows over the wire.
- Solution 2: @EntityGraph(attributePaths = {"books"}) on a repository method. Declarative alternative to JOIN FETCH. Same Cartesian trade-off, but cleaner and reusable across queries.
- Solution 3: @BatchSize(size = 50) on the collection. Hibernate batches: SELECT * FROM books WHERE author_id IN (?, ?, ..., ?) with 50 IDs at a time. For 1000 authors, 20 queries instead of 1000. No result set explosion because there is no JOIN. Often the best default for large datasets.
- Solution 4: DTO projection: SELECT new AuthorSummary(a.name, COUNT(b)) FROM Author a LEFT JOIN a.books b GROUP BY a.name. No N+1 because you load flat data, not entities. Most performant for read-only use cases.
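The query and row counts quoted above follow from simple arithmetic, which the sketch below makes explicit. It is a plain-Java model of the counting only, with no database involved; QueryCountDemo and its methods are hypothetical names, and the batching is a simplification of how Hibernate groups ids.

```java
import java.util.ArrayList;
import java.util.List;

class QueryCountDemo {
    // Lazy loading: 1 query for the parents plus 1 per parent collection.
    static int lazyLoadingQueries(int parents) {
        return 1 + parents;
    }

    // Splits ids into chunks, the way an IN (?, ?, ...) batch would be formed.
    static <T> List<List<T>> batches(List<T> ids, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            out.add(ids.subList(i, Math.min(i + batchSize, ids.size())));
        }
        return out;
    }

    // Batch fetching: 1 query for the parents plus 1 IN-query per batch of ids.
    static int batchFetchQueries(List<Long> parentIds, int batchSize) {
        return 1 + batches(parentIds, batchSize).size();
    }

    // JOIN FETCH: a single query, but one result row per (parent, child) pair.
    static int joinFetchRows(int parents, int childrenPerParent) {
        return parents * childrenPerParent;
    }

    public static void main(String[] args) {
        List<Long> authorIds = new ArrayList<>();
        for (long id = 1; id <= 1000; id++) {
            authorIds.add(id);
        }
        System.out.println("lazy loading queries: " + lazyLoadingQueries(authorIds.size()));
        System.out.println("batch fetch queries:  " + batchFetchQueries(authorIds, 50));
        System.out.println("join fetch rows:      " + joinFetchRows(100, 10));
    }
}
```

The batch figure here counts the parent query as well, so 1000 authors with a batch size of 50 gives 21 queries in total: 1 for the authors and 20 for their books.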
Detection: set spring.jpa.properties.hibernate.generate_statistics=true to log query counts per session. In tests, use the datasource-proxy library to wrap your DataSource and assert on the count, for example assertThat(queryCount).isLessThanOrEqualTo(3). In development, spring.jpa.show-sql=true (or a logging proxy like p6spy) exposes repeated SELECTs with different WHERE values, the telltale N+1 signature. For production, check slow query logs sorted by frequency, not duration. A 2ms query executed 10,000 times per request is worse than a 200ms query executed once.
Explain optimistic locking vs. pessimistic locking in JPA. When would you choose each, and what are the failure modes?
- Optimistic locking assumes conflicts are rare. A @Version field is included in every UPDATE’s WHERE clause: UPDATE product SET price = 10, version = 2 WHERE id = 1 AND version = 1. If the version changed, zero rows update, and Hibernate throws OptimisticLockException. The application must catch and retry.
- Pessimistic locking acquires a database-level lock: SELECT ... FOR UPDATE. Other transactions block until the lock is released at commit/rollback. Guarantees exclusive access but reduces throughput.
- Choose optimistic for high-read, low-write scenarios. A product catalog read millions of times, prices changed occasionally. Zero locking overhead on reads. Conflicts are rare and cheap to handle.
- Choose pessimistic for high-contention, high-cost-of-failure scenarios. Seat reservations where 500 people book the last 10 seats simultaneously. With optimistic locking, 490 transactions fail and retry, creating a storm. With pessimistic, transactions queue in an orderly way.
- Failure mode for optimistic: retry storms under contention. 100 concurrent transactions read version 1, all fail except one, all retry, 98 fail again. Exponential backoff with jitter is essential.
- Failure mode for pessimistic: deadlocks. Transaction A locks row 1, waits for row 2. Transaction B locks row 2, waits for row 1. The database kills one. Always acquire locks in consistent order (e.g., by PK ascending).
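Retrying with exponential backoff and jitter, as recommended for the optimistic failure mode, might look like the following. This is a minimal plain-Java sketch: StaleVersionException is a hypothetical stand-in for the optimistic locking exception, and the "full jitter" formula (a random delay between 0 and the capped exponential ceiling) is one common choice, not something prescribed by JPA or Spring.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

class OptimisticRetryDemo {
    // Hypothetical stand-in for the optimistic-lock failure.
    static class StaleVersionException extends RuntimeException { }

    // Full jitter: a random delay in [0, min(cap, base * 2^attempt)] milliseconds.
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long ceiling = Math.min(capMillis, baseMillis << Math.min(attempt, 20));
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    // Runs the action, retrying on stale-version failures up to maxAttempts times.
    static <T> T retry(Supplier<T> action, int maxAttempts, long baseMillis, long capMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return action.get();
            } catch (StaleVersionException e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e;                     // out of attempts: surface the conflict
                }
                try {
                    Thread.sleep(backoffMillis(attempt, baseMillis, capMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = retry(() -> {
            if (calls[0]++ < 2) {
                throw new StaleVersionException();   // first two attempts conflict
            }
            return "saved";
        }, 5, 10, 200);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real service the action should reload the entity inside a fresh transaction on every attempt; retrying with the same stale object would fail again.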
With @Version, the version field is automatically included in the UPDATE WHERE clause. The subtle case: if you detach() an entity, send it to a UI form, and later merge() it, the version check still applies. If another transaction updated the row between detach and merge, the merge throws OptimisticLockException. The version field becomes a concurrency token for the entire user workflow, not just one transaction. This is common in edit forms where the entity is detached for minutes before being saved.
What are the different transaction isolation levels, and describe a concrete production bug caused by choosing the wrong one.
- READ_UNCOMMITTED: Dirty reads allowed. Transaction A sees uncommitted changes from B. If B rolls back, A acted on data that never existed. Almost never used except for analytics where approximate counts suffice.
- READ_COMMITTED (PostgreSQL default): No dirty reads. But non-repeatable reads are possible: you read a row, another transaction modifies and commits it, you re-read and get a different value within the same transaction.
- REPEATABLE_READ (MySQL InnoDB default): Re-reading the same row always returns the same value within a transaction. MySQL uses MVCC snapshots. But phantom reads can occur: range queries might return different rows if another transaction inserts matching rows.
- SERIALIZABLE: Full isolation. Transactions execute as if sequential. Prevents all anomalies but costs throughput, through range locks or aborted transactions depending on the engine.
- Concrete bug: A financial system uses READ_COMMITTED for account transfers. The transfer method reads Account A’s balance (1000) and deducts 600. A second, concurrent transfer reads the same balance (still 1000, because the first transaction is uncommitted) and also deducts 600. Both compute a new balance of 400 and commit, so one deduction is silently lost. With optimistic locking, or with REPEATABLE_READ on an engine that detects concurrent updates (PostgreSQL does), the second transaction would fail.
Spring applies the configured isolation level through JDBC’s Connection.setTransactionIsolation(), and the database engine enforces it. PostgreSQL’s SERIALIZABLE uses Serializable Snapshot Isolation (SSI), which allows concurrency but aborts on detected conflicts. MySQL’s SERIALIZABLE acquires shared locks on all reads, blocking writers with much worse throughput. If your app is database-agnostic, test isolation behavior on each database; do not just trust the label. The actual guarantees and performance characteristics vary dramatically between engines.