Implemented capability to separate diff logs via log4j2 (#315)
* Implemented capability to separate diff logs via log4j2

* Minor changes to log config

* Log4j2 fixes
pravinbhat authored Oct 4, 2024
1 parent dca6d3a commit 42650d9
Showing 8 changed files with 98 additions and 33 deletions.
15 changes: 13 additions & 2 deletions README.md
@@ -68,7 +68,7 @@ Note:
--class com.datastax.cdm.job.DiffData cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
```

- Validation job will report differences as “ERRORS” in the log file as shown below
- Validation job will report differences as “ERRORS” in the log file as shown below.

```
23/04/06 08:43:06 ERROR DiffJobSession: Mismatch row found for key: [key3] Mismatch: Target Index: 1 Origin: valueC Target: value999)
@@ -79,6 +79,17 @@ Note:

- Please grep for all `ERROR` from the output log files to get the list of missing and mismatched records.
- Note that it lists differences by primary-key values.
- To redirect these logs into a separate file, use the `log4j2.properties` file [provided here](./src/resources/log4j2.properties) as shown below:

```
./spark-submit --properties-file cdm.properties \
--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
--conf "spark.executor.extraJavaOptions='-Dlog4j.configurationFile=log4j2.properties'" \
--conf "spark.driver.extraJavaOptions='-Dlog4j.configurationFile=log4j2.properties'" \
--master "local[*]" --driver-memory 25G --executor-memory 25G \
--class com.datastax.cdm.job.DiffData cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
```
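
As a rough sketch of the grep step above, the shell helpers below count the `ERROR` lines logged by `DiffJobSession` and list the distinct keys they mention. The function names `count_diff_errors` and `list_diff_keys` are made up for illustration and are not part of CDM, and the `for key: [...]` pattern is assumed from the sample log line shown earlier, so adjust it if your log lines differ. With the `log4j2.properties` template, the file to inspect would be `cdm_logs/cdm_diff.log`, relative to the directory the job runs from.

```shell
# Illustrative helpers (hypothetical names, not part of CDM).

# Print the number of ERROR lines logged by DiffJobSession in the given file.
count_diff_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c 'ERROR.*DiffJobSession' "$1" || true
}

# Print the distinct "[...]" keys that follow "for key:" on those lines.
list_diff_keys() {
    grep 'ERROR.*DiffJobSession' "$1" \
        | sed -n 's/.*for key: \(\[[^]]*\]\).*/\1/p' \
        | sort -u
}
```

For example, `count_diff_errors cdm_logs/cdm_diff.log` gives a quick total, and `list_diff_keys cdm_logs/cdm_diff.log` gives a de-duplicated list of keys to investigate.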

- The Validation job can also be run in an AutoCorrect mode. This mode can
- Add any missing records from origin to target
- Update any mismatched records between origin and target (makes target same as origin).
@@ -102,7 +113,7 @@ Note:
```

# Perform large-field Guardrail violation checks
- The tool can be used to identify large fields from a table that may break you cluster guardrails (e.g. AstraDB has a 10MB limit for a single large field) `--class com.datastax.cdm.job.GuardrailCheck` as shown below
- The tool can be used to identify large fields in a table that may break your cluster guardrails (e.g. AstraDB has a 10MB limit for a single large field). Use the class option `--class com.datastax.cdm.job.GuardrailCheck` as shown below:

```
./spark-submit --properties-file cdm.properties \
3 changes: 2 additions & 1 deletion RELEASE.md
@@ -1,5 +1,6 @@
# Release Notes
## [4.4.2] - 2024-10-TBD
## [4.5.0] - 2024-10-03
- Upgraded to log4j 2.x and included a template properties file that separates general logs from CDM-specific logs, including a separate log for rows that `DiffData` (Validation) reports as errors.
- Upgraded to use Spark `3.5.3`.

## [4.4.1] - 2024-09-20
8 changes: 4 additions & 4 deletions SIT/cdm.sh
@@ -91,14 +91,14 @@ if [ $argErrors -ne 0 ]; then
_usage
fi

if [ ! -f /local/log4j.xml ]; then
cd /local && jar xvf /local/cassandra-data-migrator.jar log4j.xml && cd -
if [ ! -f /local/log4j2_docker.properties ]; then
cd /local && jar xvf /local/cassandra-data-migrator.jar log4j2_docker.properties && cd -
fi

spark-submit --properties-file "${PROPERTIES}" \
--master "local[*]" \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configurationFile=file:///local/log4j.xml -Dcom.datastax.cdm.log.level=DEBUG" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configurationFile=file:///local/log4j.xml -Dcom.datastax.cdm.log.level=DEBUG" \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configurationFile=file:///local/log4j2_docker.properties -Dcom.datastax.cdm.log.level=DEBUG" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configurationFile=file:///local/log4j2_docker.properties -Dcom.datastax.cdm.log.level=DEBUG" \
--class ${CLASS} \
/local/cassandra-data-migrator.jar

12 changes: 12 additions & 0 deletions pom.xml
@@ -45,6 +45,18 @@
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-1.2-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j-impl</artifactId>
3 changes: 0 additions & 3 deletions src/main/java/com/datastax/cdm/properties/PropertyHelper.java
@@ -173,8 +173,6 @@ public List<Number> getNumberList(String propertyName) {

@Override
public List<Integer> getIntegerList(String propertyName) {
List<Integer> intList = new ArrayList<>();
Integer i;
if (null == propertyName || PropertyType.NUMBER_LIST != getType(propertyName)
|| null == getNumberList(propertyName))
return null;
@@ -188,7 +186,6 @@ public Boolean getBoolean(String propertyName) {

@Override
public String getAsString(String propertyName) {
String rtn;
if (null == propertyName)
return null;
PropertyType t = getType(propertyName);
16 changes: 0 additions & 16 deletions src/resources/log4j.xml

This file was deleted.

59 changes: 59 additions & 0 deletions src/resources/log4j2.properties
@@ -0,0 +1,59 @@
# Copyright DataStax, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

appender.0.type = Console
appender.0.name = CONSOLE
appender.0.layout.type = PatternLayout
appender.0.layout.pattern = %d %-5p [%t] %c{1}:%L - %m%n

appender.1.type = RollingFile
appender.1.name = MAIN
appender.1.fileName = cdm_logs/cdm.log
appender.1.filePattern = cdm_logs/cdm.%d{yyyy-MM-dd-HHmm}.%i.log
appender.1.layout.type = PatternLayout
appender.1.layout.pattern = %d %-5p [%t] %c{1}:%L - %m%n
appender.1.policy.type = Policies
appender.1.policy.0.type = OnStartupTriggeringPolicy
appender.1.policy.1.type = SizeBasedTriggeringPolicy
appender.1.policy.1.size = 10m
appender.1.strategy.type = DefaultRolloverStrategy
appender.1.strategy.max = 100

appender.2.type = RollingFile
appender.2.name = DIFF
appender.2.fileName = cdm_logs/cdm_diff.log
appender.2.filePattern = cdm_logs/cdm_diff.%d{yyyy-MM-dd-HHmm}.%i.log
appender.2.layout.type = PatternLayout
appender.2.layout.pattern = %d %-5p [%t] %c{1}:%L - %m%n
appender.2.policy.type = Policies
appender.2.policy.0.type = OnStartupTriggeringPolicy
appender.2.policy.1.type = SizeBasedTriggeringPolicy
appender.2.policy.1.size = 10m
appender.2.strategy.type = DefaultRolloverStrategy
appender.2.strategy.max = 100

rootLogger.level = INFO
rootLogger.appenderRef.0.ref = CONSOLE
rootLogger.appenderRef.0.level = INFO

logger.0.name = com.datastax.cdm
logger.0.level = INFO
logger.0.additivity = false
logger.0.appenderRef.0.ref = MAIN

logger.1.name = com.datastax.cdm.job.DiffJobSession
logger.1.level = ERROR
logger.1.additivity = false
logger.1.appenderRef.0.ref = DIFF
@@ -13,10 +13,11 @@
# limitations under the License.
#

# Root logger option
log4j.rootLogger=INFO, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p [THREAD ID=%t] %c{1}:%L - %m%n
appender.0.type = Console
appender.0.name = CONSOLE
appender.0.layout.type = PatternLayout
appender.0.layout.pattern = %d %-5p [%t] %c{1}:%L - %m%n

rootLogger.level = INFO
rootLogger.appenderRef.0.ref = CONSOLE
rootLogger.appenderRef.0.level = INFO
