Service Discovery

In a microservices world, instances come and go as services scale up and down, so their IP addresses change constantly. Hardcoding URLs such as http://localhost:8081 does not work in production. We need a “Phonebook”.

1. Netflix Eureka Server

Eureka is a Service Registry. Services register themselves here, and other services query it to find them.

Setup Eureka Server

Create a new Spring Boot project.
Dependency: spring-cloud-starter-netflix-eureka-server.
Annotation: @EnableEurekaServer.
Configuration (application.yml):

server:
  port: 8761

eureka:
  client:
    register-with-eureka: false # It is the server, doesn't need to register itself
    fetch-registry: false
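
To make the "register, then query" idea concrete, here is a minimal JDK-only sketch of a registry. ToyRegistry and its methods are hypothetical names used for illustration; this is not the Eureka API, and it leaves out heartbeats, replication, and caching.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory registry; not the Eureka API.
class ToyRegistry {
    private final Map<String, List<String>> instancesByService = new ConcurrentHashMap<>();

    // A service instance announces itself under its logical name.
    void register(String serviceName, String hostAndPort) {
        instancesByService
            .computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>())
            .add(hostAndPort);
    }

    // An instance removes itself, for example on shutdown.
    void deregister(String serviceName, String hostAndPort) {
        List<String> instances = instancesByService.get(serviceName);
        if (instances != null) {
            instances.remove(hostAndPort);
        }
    }

    // Callers ask for every known address of a service.
    List<String> lookup(String serviceName) {
        List<String> instances = instancesByService.get(serviceName);
        return instances == null ? Collections.emptyList() : new ArrayList<>(instances);
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        registry.register("USER-SERVICE", "10.0.1.5:8081");
        registry.register("USER-SERVICE", "10.0.1.6:8081");
        System.out.println(registry.lookup("USER-SERVICE"));
    }
}
```

The addresses are made up. The point is only that callers look services up by name, never by a hardcoded address.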

2. Eureka Client (The Microservice)

Now, let’s connect a User Service to Eureka.

Dependency: spring-cloud-starter-netflix-eureka-client.
Configuration:

spring:
  application:
    name: USER-SERVICE # Important! This is the ID in the registry

eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka

When you start the app, it will register with Eureka. Check http://localhost:8761 to see it listed.

3. Service-to-Service Communication

How does Order Service call User Service without knowing the IP?

The Flow: Client-Side Load Balancing

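In the old days (Hardware LB), a central appliance or proxy such as F5 or NGINX did the heavy lifting. In Microservices, the Client (Order Service) is smart: it fetches the instance list from the registry, caches it, and picks an instance itself.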

RestTemplate (Legacy, Maintenance Mode)

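import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.web.client.RestTemplate;

// In a @Configuration class
@LoadBalanced // Marks this RestTemplate so Spring adds the load-balancing interceptor
@Bean
public RestTemplate restTemplate() {
    return new RestTemplate();
}

// In Service (restTemplate is the injected bean defined above)
// USER-SERVICE is the logical name from the registry, not a real host
UserDto user = restTemplate.getForObject("http://USER-SERVICE/users/{id}", UserDto.class, userId);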

Spring intercepts the request, looks up USER-SERVICE in Eureka, gets a list of IPs, and uses Round Robin load balancing to pick one.
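
The "pick one" step can be sketched in plain Java. RoundRobinChooser is a hypothetical helper written for this page, not Spring Cloud LoadBalancer's implementation, and the instance addresses are invented.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; not Spring Cloud LoadBalancer's implementation.
class RoundRobinChooser {
    private final AtomicInteger position = new AtomicInteger(0);

    // Returns the next instance in rotation, or null when none are known.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            return null;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(position.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        List<String> userService = List.of("10.0.1.5:8081", "10.0.1.6:8081");
        RoundRobinChooser chooser = new RoundRobinChooser();
        for (int i = 0; i < 4; i++) {
            System.out.println("http://" + chooser.choose(userService) + "/users/1");
        }
    }
}
```

Each call advances a shared counter, so consecutive requests alternate between the known instances.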

WebClient (Reactive & Modern)

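WebClient is the non-blocking HTTP client in modern Spring. It lives in Spring WebFlux, so spring-boot-starter-webflux must be on the classpath.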

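import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.web.reactive.function.client.WebClient;

// In a @Configuration class
@Bean
@LoadBalanced
public WebClient.Builder webClientBuilder() {
    return WebClient.builder();
}

// In Service (webClientBuilder is the injected @LoadBalanced builder)
UserDto user = webClientBuilder.build()
    .get()
    .uri("http://USER-SERVICE/users/{id}", userId)
    .retrieve()
    .bodyToMono(UserDto.class)
    .block(); // Blocks the calling thread; use only in synchronous (servlet) code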
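
If the difference between composing a result and blocking for it is unfamiliar, the following JDK-only analogy may help. It uses CompletableFuture as a stand-in for Mono and makes no HTTP call; fetchUserName and greeting are hypothetical methods, and Reactor's real semantics (laziness, backpressure) are not modelled.

```java
import java.util.concurrent.CompletableFuture;

// JDK-only analogy: CompletableFuture stands in for Mono; no HTTP call is made.
class AsyncCallSketch {
    // Hypothetical stand-in for a remote call that completes later.
    static CompletableFuture<String> fetchUserName(long userId) {
        return CompletableFuture.supplyAsync(() -> "user-" + userId);
    }

    // Non-blocking style: describe what happens once the result arrives.
    static CompletableFuture<String> greeting(long userId) {
        return fetchUserName(userId).thenApply(name -> "Hello, " + name);
    }

    public static void main(String[] args) {
        // join() waits for the result, much like block() on a Mono.
        System.out.println(greeting(42).join());
    }
}
```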

Feign Client (Declarative)

The cleanest way. It looks like a JPA Repository but for HTTP calls.

Dependency: spring-cloud-starter-openfeign.
Annotation: @EnableFeignClients.

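import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

@FeignClient(name = "USER-SERVICE") // name is the service ID registered in Eureka
public interface UserClient {

    @GetMapping("/users/{id}")
    UserDto getUser(@PathVariable("id") Long id);
}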

Now just autowire UserClient and call getUser(id). Spring handles the lookup, load balancing, and HTTP serialization!
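
How can an interface with no implementation make HTTP calls? A rough sketch of the idea, using a JDK dynamic proxy: the @Get annotation, ToyUserClient, and DeclarativeClientSketch are hypothetical names, and the proxy returns the request line as a string instead of sending anything. Feign's real machinery adds encoding, decoding, load balancing, and error handling on top.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;

class DeclarativeClientSketch {
    // Builds an implementation of the given interface at runtime.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, String baseUrl) {
        return (T) Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] {type},
            (proxy, method, args) -> {
                Get get = method.getAnnotation(Get.class);
                if (get == null) {
                    throw new UnsupportedOperationException(method.getName());
                }
                // Substitute the single path variable with the first argument.
                String path = get.value().replaceAll("\\{[^}]+\\}", String.valueOf(args[0]));
                // A real client would send the request and decode the response here.
                return "GET " + baseUrl + path;
            });
    }

    public static void main(String[] args) {
        ToyUserClient client = create(ToyUserClient.class, "http://USER-SERVICE");
        System.out.println(client.getUser(42L));
    }
}

// Hypothetical annotation standing in for @GetMapping.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Get {
    String value();
}

// Hypothetical client interface; returns the request line instead of a DTO.
interface ToyUserClient {
    @Get("/users/{id}")
    String getUser(long id);
}
```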

Interview Deep-Dive

Compare client-side service discovery (Eureka) with server-side service discovery (Kubernetes DNS / AWS ALB). What are the trade-offs, and when would you pick one over the other?

Strong Answer:

Client-side discovery (Eureka pattern): each service instance registers itself with a registry (Eureka Server). The calling service fetches the registry, caches it locally, and picks an instance using a client-side load balancer (Spring Cloud LoadBalancer). The client is “smart” — it knows about all available instances and can implement sophisticated routing (weighted, zone-aware, canary).
Server-side discovery (Kubernetes Services, AWS ALB): the caller sends requests to a stable DNS name or virtual IP. The platform (Kubernetes kube-proxy, AWS load balancer) routes to a healthy instance. The client is “dumb” — it just calls a single address and the infrastructure handles the rest.
Trade-off 1 (complexity ownership): client-side pushes complexity into the application (every service needs a Eureka client library, which in practice means Java or a language-specific port). Server-side pushes complexity into infrastructure (the platform manages routing, and any language can participate via DNS).
Trade-off 2 (staleness): with client-side, the registry cache can be stale. By default, Eureka clients send a heartbeat every 30 seconds and refresh their registry cache every 30 seconds, and the server only evicts an instance once its lease expires (90 seconds without a heartbeat). A crashed instance that never deregistered can therefore keep receiving traffic for well over a minute. Kubernetes Services update in near real time via the API server watch mechanism.
Trade-off 3 (flexibility): client-side load balancing allows application-aware routing (send 10% of traffic to the canary version, route premium users to dedicated instances). Plain Kubernetes Services do not offer this; you typically need a service mesh (Istio) to achieve the same. With Eureka and Spring Cloud LoadBalancer, you can write custom ServiceInstanceListSupplier implementations.
My recommendation: if you are already on Kubernetes, use Kubernetes DNS for basic discovery and skip Eureka. It reduces operational overhead (no Eureka cluster to manage) and works across languages. Add a service mesh only if you need advanced traffic management. If you are running on bare VMs or a non-K8s platform, Eureka is a solid choice.
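
The staleness trade-off comes from lease-based liveness: the registry only knows when it last heard from an instance. A minimal sketch, assuming a 90-second lease (Eureka's default lease expiration) and a caller-supplied clock. LeaseRegistry is a hypothetical class, not Eureka code, and it ignores the eviction timer and caches that add further delay in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lease table; not Eureka's implementation.
class LeaseRegistry {
    static final long LEASE_MILLIS = 90_000; // assumed lease: 90 s after the last heartbeat

    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    // Registration and renewal both just record the time of the heartbeat.
    void heartbeat(String instance, long nowMillis) {
        lastHeartbeat.put(instance, nowMillis);
    }

    // Instances whose lease has not yet expired at the given time.
    List<String> liveInstances(long nowMillis) {
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() < LEASE_MILLIS) {
                live.add(entry.getKey());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        LeaseRegistry registry = new LeaseRegistry();
        registry.heartbeat("10.0.1.5:8081", 0); // crashes right after this heartbeat
        // One minute later the crashed instance is still listed as live.
        System.out.println(registry.liveInstances(60_000));
    }
}
```

An instance that crashes silently stays in the list until its lease runs out, which is why callers need their own protection against dead instances.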

Follow-up: What happens to in-flight requests when a service instance is shutting down? How does graceful shutdown work with Eureka vs. Kubernetes?

With Eureka: the instance sends a deregistration request on shutdown. But there is a lag: other clients still have the old cache for up to 30 seconds. During that window, requests are routed to the dying instance. Spring Boot’s graceful shutdown (server.shutdown=graceful) helps: it stops accepting new connections but finishes processing in-flight requests within a configurable timeout (spring.lifecycle.timeout-per-shutdown-phase=30s).
With Kubernetes: the pod receives a SIGTERM, and Kubernetes removes the pod from the Service’s endpoint list. But there is a race condition: kube-proxy on other nodes may not have updated yet. The standard fix: add a preStop hook with a small sleep (sleep 5) to allow endpoint propagation before the app starts shutting down.
Both approaches require the application to handle the transition window properly.
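
The "stop accepting, then drain" behaviour can be illustrated with a plain ExecutorService. This is only an analogy for what a web server does during graceful shutdown; GracefulShutdownSketch, the 100 ms task, and the 5-second grace period are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class GracefulShutdownSketch {
    // Returns true when new work was rejected and the in-flight task still finished.
    static boolean demo() {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean(false);

        // An in-flight "request" that needs a little time to complete.
        workers.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(100);
                finished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            started.await();
            workers.shutdown(); // stop accepting new work, let running tasks continue
            boolean rejected = false;
            try {
                workers.submit(() -> { });
            } catch (RejectedExecutionException e) {
                rejected = true;
            }
            // Wait up to the grace period for in-flight work to drain.
            boolean drained = workers.awaitTermination(5, TimeUnit.SECONDS);
            return rejected && drained && finished.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            workers.shutdownNow();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("graceful: " + demo());
    }
}
```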

The @LoadBalanced annotation on a RestTemplate or WebClient seems like magic. Explain what it actually does at the Spring framework level.

Strong Answer:

@LoadBalanced is a marker annotation that is itself meta-annotated with @Qualifier. Spring Cloud’s load-balancer auto-configuration finds the RestTemplate and WebClient.Builder beans that carry it and customizes them: a LoadBalancerInterceptor is added to each RestTemplate, and a load-balancing exchange filter (ReactorLoadBalancerExchangeFilterFunction) is added to each WebClient.Builder.
Here is what happens at request time: you call restTemplate.getForObject("http://USER-SERVICE/users/1", UserDto.class). The interceptor treats the host part of the URL, USER-SERVICE, as a service ID rather than a DNS name. It passes this name to the LoadBalancerClient, which delegates to a ReactiveLoadBalancer backed by the ServiceInstanceListSupplier chain. The supplier asks the discovery client (Eureka, Kubernetes, or a static list) for all instances of USER-SERVICE. The load balancer picks one using its algorithm (Round Robin by default). The interceptor rewrites the URL to, for example, http://10.0.1.5:8081/users/1 and forwards the request.
The key insight: @LoadBalanced does not change the RestTemplate’s core behavior. It adds an interceptor to the interceptor chain. Without the annotation, RestTemplate would try to resolve USER-SERVICE via DNS and fail with an UnknownHostException (surfaced as a ResourceAccessException). With it, the interceptor swaps in a real address before any DNS resolution happens.
The gotcha: if you have both a @LoadBalanced RestTemplate and a regular RestTemplate bean, plain autowiring becomes ambiguous. Because @LoadBalanced is a qualifier, annotate the injection point with @LoadBalanced to get the load-balanced one, or give the beans distinct names and select them with @Qualifier. Also, @LoadBalanced only works on beans: if you create a RestTemplate with new RestTemplate() inside a method, it will not be load-balanced.
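
The URL rewrite step can be sketched with java.net.URI alone. UriRewriteSketch is a hypothetical helper, not the Spring class that performs this step, and the target address is the illustrative one used in the answer.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriRewriteSketch {
    // Replaces the logical service name in the URI with a concrete host and port.
    static URI rewrite(URI original, String host, int port) {
        try {
            return new URI(original.getScheme(), original.getUserInfo(), host, port,
                original.getPath(), original.getQuery(), original.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI logical = URI.create("http://USER-SERVICE/users/1");
        // The "host" of the logical URI is really the service ID to look up.
        System.out.println("service id: " + logical.getHost());
        System.out.println(rewrite(logical, "10.0.1.5", 8081));
    }
}
```

Only the host and port change; scheme, path, and query are carried over untouched.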

Follow-up: How does the client-side load balancer handle instance health? What if it routes to an instance that Eureka says is UP but is actually unhealthy?

Eureka’s health check is based on heartbeats (every 30 seconds). An instance can be functionally broken (returning 500 on every request) while still sending heartbeats, so Eureka marks it as UP. Spring Cloud LoadBalancer supports a HealthCheckServiceInstanceListSupplier that periodically probes instances and keeps only the healthy ones in the rotation. You enable it with spring.cloud.loadbalancer.configurations=health-check and set the probe path with spring.cloud.loadbalancer.health-check.path.default=/actuator/health. However, this adds background probe traffic from every client, and it is mostly useful when no registry is supplying health status. A more practical approach: combine Eureka discovery with retries and circuit breakers (Resilience4j). A failed call can be retried against another instance in the list, and the circuit breaker stops calls to a dependency that keeps failing. This way, you tolerate Eureka’s staleness because the circuit breaker provides real-time failure detection at the application level.
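
To show what "the circuit opens" means, here is a deliberately simplified breaker that trips after a number of consecutive failures and allows calls again after a cool-down. ToyCircuitBreaker is hypothetical; Resilience4j uses failure-rate sliding windows and a half-open state, neither of which is modelled here.

```java
// Hypothetical, simplified circuit breaker; not Resilience4j's algorithm.
class ToyCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    ToyCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Calls are allowed while closed, or once the open period has elapsed.
    synchronized boolean allowRequest(long nowMillis) {
        return openedAt < 0 || nowMillis - openedAt >= openMillis;
    }

    // A success closes the circuit and clears the failure count.
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    // Enough failures in a row open the circuit at the given time.
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = nowMillis;
        }
    }

    public static void main(String[] args) {
        ToyCircuitBreaker breaker = new ToyCircuitBreaker(3, 10_000);
        for (long t = 0; t < 3; t++) {
            breaker.recordFailure(t);
        }
        System.out.println("allowed right after tripping: " + breaker.allowRequest(3));
        System.out.println("allowed after cool-down: " + breaker.allowRequest(20_000));
    }
}
```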

Compare RestTemplate, WebClient, and Feign Client for service-to-service communication. In what scenario would each be the best choice?

Strong Answer:

RestTemplate: synchronous, blocking HTTP client. Simple API, well-understood. But each call blocks a thread. In a service handling 200 concurrent requests where each call waits 500ms for a downstream service, you need 200 threads just for waiting. Thread creation, context switching, and memory overhead become the bottleneck. Spring has marked it as “in maintenance mode” — no new features, but not deprecated.
WebClient: non-blocking, reactive HTTP client built on Project Reactor. Returns Mono or Flux instead of blocking. A single thread can handle thousands of concurrent HTTP calls because it uses event-loop-based I/O (Netty). Even if you are not building a fully reactive application, WebClient is the recommended choice for making HTTP calls because it handles backpressure and supports streaming. You can call .block() to use it synchronously in traditional servlet-based apps.
Feign Client: declarative HTTP client. You define an interface with annotations (@GetMapping, @PathVariable) and Spring generates the implementation. It reads like a JPA repository for HTTP. Advantages: extremely clean code, integrates with Eureka and Resilience4j out of the box, supports request/response interceptors for cross-cutting concerns (adding auth headers). Disadvantage: by default synchronous and blocking (though async support is available via CompletableFuture return types).
Best fit: Feign when you have many stable internal APIs and want clean, declarative code (most microservice-to-microservice calls). WebClient when you need non-blocking I/O, streaming, or are calling external APIs with variable latency. RestTemplate only in legacy codebases where migration cost is not justified.
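
The thread arithmetic behind the RestTemplate point follows from Little's law: calls in flight equal arrival rate times latency. A small sketch, assuming (hypothetically) 400 requests per second to reach the 200 concurrent calls mentioned above; ThreadDemandSketch is an illustrative name.

```java
class ThreadDemandSketch {
    // Threads tied up waiting = arrival rate x time each call blocks (Little's law).
    static long blockedThreads(long requestsPerSecond, long latencyMillis) {
        return (requestsPerSecond * latencyMillis + 999) / 1000; // round up
    }

    public static void main(String[] args) {
        // Assumed load: 400 requests/s, each blocking 500 ms on a downstream call.
        System.out.println(blockedThreads(400, 500) + " threads blocked on I/O");
    }
}
```

With a blocking client, that many threads sit idle at any moment; a non-blocking client holds the same calls in flight without dedicating a thread to each.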

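Follow-up: How do you add retry logic and circuit breaking to a Feign Client without polluting the interface definition?

Feign integrates with Resilience4j via spring-cloud-starter-circuitbreaker-resilience4j. With that starter on the classpath and spring.cloud.openfeign.circuitbreaker.enabled=true (feign.circuitbreaker.enabled in older releases), each Feign call is wrapped in a circuit breaker. You configure the breakers in application.yml, keyed by circuit breaker name; by default the name is derived from the client class and method, so check the generated names. For retries, Feign uses a Retryer bean. Spring Cloud OpenFeign’s default is Retryer.NEVER_RETRY, so declare your own Retryer bean (for example Retryer.Default) to enable them. You can also layer Resilience4j’s @Retry on the service method that calls the Feign client. The key principle: the Feign interface stays clean (just the HTTP contract), and resilience configuration is externalized to YAML or applied at the service layer via annotations.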
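
The "resilience at the service layer" idea boils down to wrapping the client call instead of changing the client. A minimal sketch with a hypothetical withRetry helper; it retries immediately on any RuntimeException, whereas a real policy such as Resilience4j's would add backoff and filter by exception type.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Runs the call up to maxAttempts times and rethrows the last failure.
    static <T> T withRetry(int maxAttempts, Supplier<T> call) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // a real policy would also wait before the next attempt
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for userClient.getUser(id): fails twice, then succeeds.
        String result = withRetry(3, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("USER-SERVICE unavailable");
            }
            return "user-1";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The client interface never sees the retry logic, so the same wrapper can be applied to any call.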
