Add Micrometer to your pom.xml:
- spring-boot-starter-actuator
- io.micrometer:micrometer-registry-prometheus (version managed by Spring Boot)
Expose the metrics for Prometheus by adding this to your application properties:
management.endpoints.web.exposure.include=*
management.endpoint.prometheus.enabled=true
Make sure http://localhost:8080/actuator/prometheus returns data
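To sanity-check what the endpoint returns, the text format can be parsed in a few lines. This is only a minimal sketch in plain Java: the class name PrometheusTextParser and the sample lines are made up for illustration, and it assumes plain numeric values with no trailing timestamps and no spaces inside label values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrometheusTextParser {

    /** Parses "name{labels} value" lines; skips blank lines and '#' comment lines. */
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : body.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // HELP / TYPE metadata or empty line
            }
            int lastSpace = line.lastIndexOf(' ');
            if (lastSpace < 0) {
                continue; // not a "key value" pair
            }
            String key = line.substring(0, lastSpace).trim();
            double value = Double.parseDouble(line.substring(lastSpace + 1));
            samples.put(key, value);
        }
        return samples;
    }

    public static void main(String[] args) {
        // Illustrative sample only, not real output from the workshop app.
        String body =
            "# TYPE shepard_seconds summary\n"
            + "shepard_seconds_count 40.0\n"
            + "shepard_seconds_sum 4.0\n"
            + "hikaricp_connections_active{pool=\"HikariPool-1\"} 20.0\n";
        parse(body).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

If the map comes back empty for the real endpoint, the exposure properties above are the first thing to re-check.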
Install Prometheus and add this to prometheus.yml as a child of scrape_configs,
to tell it to poll the Spring Boot actuator metrics:
  - job_name: 'spring'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['localhost:8081']
Check that Prometheus is running at localhost:9090
To query, display and monitor the metrics from Prometheus:
- Install Grafana locally or run the Docker setup in this folder
- Go to Grafana at localhost:3000
- Log in with admin/admin
- Add a Prometheus data source with the URL http://host.docker.internal:9090
- Install the dashboard https://grafana.com/grafana/dashboards/6756
- Set as "Instance" variable the value:
- Monitor the "Connection Timeout" count and "Acquire Time"
- Add a panel to display shepard response times. In the formula field, enter:
sum by ()(rate(shepard_seconds_sum[5m])) / sum by ()(rate(shepard_seconds_count[5m]))
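The formula divides the growth of the total time by the growth of the request count over the same 5-minute window, so the window length roughly cancels out and what remains is the mean response time per request. A rough sketch of the same arithmetic in plain Java; the class name and the numbers are hypothetical, not values from this setup.

```java
class MeanResponseTime {

    /**
     * Mean seconds per request between two scrapes of the
     * shepard_seconds_sum and shepard_seconds_count series.
     */
    static double meanSeconds(double sumStart, double countStart,
                              double sumEnd, double countEnd) {
        double requests = countEnd - countStart;
        if (requests <= 0) {
            return Double.NaN; // no traffic in the window, nothing to average
        }
        return (sumEnd - sumStart) / requests;
    }

    public static void main(String[] args) {
        // Made-up scrapes: 100 more requests that took 8 more seconds in total.
        double mean = meanSeconds(12.0, 100, 20.0, 200);
        System.out.println("mean response time (s): " + mean);
    }
}
```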
Zipkin tracing: 1->main->2 and 1->2
an HTTP POST call from main to another service -> replaced with MQ
+ service2 + Redis on it
inject errors throughout the code.
clone the performance-microservices git repository (from yesterday's mail) + import it in IntelliJ
install Docker Desktop
start docker/docker-compose.yml (a lot of instances, god help your RAM)
start StartWireMock.java
---- an endpoint is slow even when load-tested alone with Gatling in the local environment
go to glowroot.org and download and unzip
copy the full path to glowroot.jar
in IntelliJ open the run configuration of Service2App and add to the VM arguments: -javaagent:<path_to_glowroot_jar>
start Service2App
go to localhost:8082/1 -> it returns 200
Run EngineJava in IntelliJ and select the simulation "Service2_GetByIdSimulation" (type its number, then ENTER + ENTER for an empty description)
Click the link to the Gatling report -> response time 100ms mean with normal distribution
go to localhost:4000, on the left side click /1, go to the "Thread Profile" tab > click flame graph
the bottleneck was findById, because there were only 20 connections by default in the Hikari connection pool; fix: spring.datasource.hikari.maximum-pool-size=60
restart the app (for glowroot to wipe its database)
run the simulation -> open the report = 80ms mean
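Why a bigger pool helps here: each request holds one connection while its query runs, so the pool size caps how many queries run at once and everything beyond that waits to acquire a connection. A back-of-the-envelope sketch in plain Java; the class name and the 50 ms hold time are assumptions for illustration, not measurements from this exercise.

```java
class PoolSizing {

    /** Upper bound on requests/second when each request holds one connection for holdMillis. */
    static double maxThroughput(int poolSize, double holdMillis) {
        return poolSize * 1000.0 / holdMillis;
    }

    /** Connections busy on average at a given request rate (Little's law: L = lambda * W). */
    static int connectionsNeeded(double requestsPerSecond, double holdMillis) {
        return (int) Math.ceil(requestsPerSecond * holdMillis / 1000.0);
    }

    public static void main(String[] args) {
        double holdMillis = 50.0; // hypothetical time a request keeps its connection
        System.out.println("pool=20 -> max req/s: " + maxThroughput(20, holdMillis));
        System.out.println("pool=60 -> max req/s: " + maxThroughput(60, holdMillis));
        System.out.println("connections for 1000 req/s: " + connectionsNeeded(1000, holdMillis));
    }
}
```

A larger pool only moves the limit to the database, so it is worth re-running the simulation after each change rather than trusting the arithmetic.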
=== Java-level deadlocks ===
Deadlock = two parallel flows each wait indefinitely for a resource owned by the other
- Database lock: LOCK TABLE (table lock); SELECT FOR UPDATE (row lock)
- Java lock: synchronized (really, in 2023, the year of the Lord: NEVER!); do not pass lambdas to synchronized methods, e.g. list.removeIf, map.computeIfAbsent
- Redis lock
Exercise #1 - main():
- Download VisualVM from https://visualvm.github.io/ (as standalone) and start it
- Run Deadlock.java -> the execution hangs
- In VisualVM, connect to the blocked process and go to the "Threads" tab -> "Deadlock Detected"
Note: Flamegraph will NOT point out the problem, as time waiting for 'synchronized' happens outside of Java (in C code)
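The same kind of deadlock can be reproduced and detected from code. This is not the workshop's Deadlock.java, only a minimal sketch with made-up names: two daemon threads take two monitors in opposite order, and the JVM's built-in detection is queried through ThreadMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDemo {

    /** Starts two threads that deadlock on two monitors; returns how many deadlocked threads the JVM reports. */
    static int createAndDetect() throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread t1 = new Thread(() -> lockInOrder(lockA, lockB, bothHoldFirstLock), "worker-1");
        Thread t2 = new Thread(() -> lockInOrder(lockB, lockA, bothHoldFirstLock), "worker-2");
        t1.setDaemon(true); // daemon threads, so the JVM can still exit
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) { // never acquired: the other thread owns it
                System.out.println("unreachable");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + createAndDetect());
    }
}
```

The latch makes the deadlock deterministic: neither thread asks for its second lock before both own their first one.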
Exercise #2 - JFR on a running API:
- Download Java Mission Control https://jdk.java.net/jmc/8/ and open it
- Start Service2
- In JMC, click on Service2 in the JVM Browser > "Start Flight Recording"
- Start Search simulation and wait for it to complete
- End the recording in Java Mission Control => the automatic analysis (first screen) points out that Java lock contention was a problem, pointing at the instance the threads synchronized on and showing which threads were blocked and for how long
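The lock contention that JMC reports comes from JFR events, and a recording can also be started and read from code with the jdk.jfr API (JDK 11 or later). A minimal sketch, not the Service2 code: the class name, the 10 ms threshold and the 200 ms hold time are arbitrary choices for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedThread;
import jdk.jfr.consumer.RecordingFile;

class LockContentionRecording {
    private static final Object LOCK = new Object();

    /** Records a short contended section; returns how many monitor-enter events JFR captured. */
    static long recordContention() throws Exception {
        Path file = Files.createTempFile("contention", ".jfr");
        try (Recording recording = new Recording()) {
            // Only keep waits on 'synchronized' that last longer than 10 ms.
            recording.enable("jdk.JavaMonitorEnter").withThreshold(Duration.ofMillis(10));
            recording.start();
            runContendedWork();
            recording.stop();
            recording.dump(file);
        }
        long count = 0;
        for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
            if (event.getEventType().getName().equals("jdk.JavaMonitorEnter")) {
                RecordedThread thread = event.getThread();
                String name = thread == null ? "?" : thread.getJavaName();
                System.out.println(name + " blocked for " + event.getDuration().toMillis() + " ms");
                count++;
            }
        }
        Files.deleteIfExists(file);
        return count;
    }

    private static void runContendedWork() throws InterruptedException {
        CountDownLatch holderHasLock = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                holderHasLock.countDown();
                try {
                    Thread.sleep(200); // keep the monitor so the waiter has to block
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                // entered only after the holder releases the monitor
            }
        }, "waiter");
        holder.start();
        holderHasLock.await(); // start the waiter only once the lock is taken
        waiter.start();
        holder.join();
        waiter.join();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("monitor-enter events: " + recordContention());
    }
}
```

The dumped .jfr file is the same format JMC opens, so keeping the file instead of deleting it lets you inspect it in the UI as well.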
JFR (inside the JVM) -> .jfr
  a) records stack traces -> flame graphs
  b) records events -> JVM internals
Mission Control -> load and analyze a .jfr output
Gatling -> load tests
VisualVM -> deadlocks
open -a visualvm --args --jdkhome "$(/usr/libexec/java_home -v 17)"