Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11975. wrap TermIndex in ExecutionContext #7602

Merged
merged 4 commits into from
Jan 8, 2025

Conversation

sumitagrawl
Copy link
Contributor

@sumitagrawl sumitagrawl commented Dec 20, 2024

What changes were proposed in this pull request?

Wrap TermIndex in ExecutionContext. ExecutionContext.getIndex() will have same value as termIndex.

There is no logic change.

Later, with below Jira, dependency with ratisIndex will be removed,
https://issues.apache.org/jira/browse/HDDS-11901

Parent Jira:
https://issues.apache.org/jira/browse/HDDS-11897

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11975

How was this patch tested?

Existing testcase

@sumitagrawl sumitagrawl marked this pull request as ready for review December 20, 2024 13:55
Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sumitagrawl for the patch.

I think we can keep using TermIndex even if the index is not generated by Ratis, similar to how it's used currently when Ratis is disabled in OM. TermIndex is just an interface, we can define custom implementation, if additional data is needed for generating our own index. Thus we could avoid wrapping it in yet another object, and skip refactoring a bunch of code.

@sumitagrawl
Copy link
Contributor Author

Thanks @sumitagrawl for the patch.

I think we can keep using TermIndex even if the index is not generated by Ratis, similar to how it's used currently when Ratis is disabled in OM. TermIndex is just an interface, we can define custom implementation, if additional data is needed for generating our own index. Thus we could avoid wrapping it in yet another object, and skip refactoring a bunch of code.

@adoroszlai ExecutionContext here have just one entry, but in new flow, it’s context for execution which will contains other members as required in future like both index and termIndex. Additional other parameters.
It’s problem to change interface for every new parameters.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sumitagrawl for the patch. Pls check comment.

@adoroszlai
Copy link
Contributor

ExecutionContext here have just one entry, but in new flow, it’s context for execution which will contains other members as required in future like both index and termIndex. Additional other parameters. It’s problem to change interface for every new parameters.

  • in POC (HDDS-11418. leader executor framework [prototype] #7406) ExecutionContext has one additional member: batchOperation
  • ExecutionContext is just a holder for "stuff", with get/set methods for each member
  • index/termIndex and batchOperation are used in unrelated places
  • instances of ExecutionContext may or may not have these members set
  • RequestContext is also a holder for "stuff"
  • RequestContext even has entry for ExecutionContext

So I think ExecutionContext is unnecessary and prone to be misused.

Why not put additional other parameters (batchOperation for now) in RequestContext?

@sumitagrawl
Copy link
Contributor Author

sumitagrawl commented Dec 30, 2024

ExecutionContext here have just one entry, but in new flow, it’s context for execution which will contains other members as required in future like both index and termIndex. Additional other parameters. It’s problem to change interface for every new parameters.

  • in POC (HDDS-11418. leader executor framework [prototype] #7406) ExecutionContext has one additional member: batchOperation
  • ExecutionContext is just a holder for "stuff", with get/set methods for each member
  • index/termIndex and batchOperation are used in unrelated places
  • instances of ExecutionContext may or may not have these members set
  • RequestContext is also a holder for "stuff"
  • RequestContext even has entry for ExecutionContext

So I think ExecutionContext is unnecessary and prone to be misused.

Why not put additional other parameters (batchOperation for now) in RequestContext?

@adoroszlai RequestContext is having wider information which is not required for any execution as can have misuse. Prototype and actual code will have some difference.
ExecutionContext:

  • index (ozone managed index)
  • ratisIndex (as present for old flow, may not be required once old flow is removed)

PostExecutionContext (new class for specific case)

  • ExecutionContext
  • batchOperation
  • ...

Current way of passing all parameter via method signature is not a recommended practice, that is the reason we have findbug to restrict parameter count less than six. But we have misuse that by supressing. Recommended practice is to pass via pojo if number of paramenter increases.

Another mis-use done due to method restirction is use of Thread-Local.
Another mis-use, setting request object and other parameter in Base Class, and making complicated with derivation.

ExecutionContext (as pojo) is recommended way to pass parameter if parameter can increase, or have scope of expansion. I think with this, it can provide expansion of feature without impacting existing implementation.

So I think above explanation clarify its usecase and better way of coding (as suggested by findbug also).

We can not put all parameter now itself. Prototype implementation and actual can have little difference. Certain new area is discovered with respect to prototype, and are part of design doc.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring changes LGTM +1. Follow up PR will bring Ozone Managed index changes as discussed.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sumitagrawl for the explanation. I'm still not convinced ExecutionContext is needed, but can accept it.

Comment on lines 148 to 150
ExecutionContext context = new ExecutionContext();
context.setTermIndex(TransactionInfo.getTermIndex(transactionLogIndex));
context.setIndex(transactionLogIndex);
Copy link
Contributor

@adoroszlai adoroszlai Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please improve safety of ExecutionContext:

  1. Remove mutators (set...)
  2. Create private constructors
  3. Make member variables final
  4. Introduce factory methods like:
  public static ExecutionContext of(long index)
  public static ExecutionContext of(TermIndex termIndex)

This reduces steps required in code that creates instances, and ensures only valid ExecutionContext can be created:

Suggested change
ExecutionContext context = new ExecutionContext();
context.setTermIndex(TransactionInfo.getTermIndex(transactionLogIndex));
context.setIndex(transactionLogIndex);
return validateAndUpdateCache(ozoneManager, ExecutionContext.of(transactionLogIndex));

If you plan to add many more members, then add builder instead of factory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adoroszlai updated ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sumitagrawl for updating the patch.

Original suggestion simplifies callers and also ensures index and termIndex.getIndex() are consistent:

  public static ExecutionContext of(long index)
  public static ExecutionContext of(TermIndex termIndex)

Copy link
Contributor Author

@sumitagrawl sumitagrawl Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adoroszlai
This will be different in next PR, so created having together. Only this is to have most files impact to common place in this PR. So above can not be followed.

index: OM managed index -- it will be different of ratis index.
termIndex: Ratis index

But both is required to be part of ExecutionContext, used public static ExecutionContext of(long index, TermIndex termIndex)

Once number of params increases, will change to Builder pattern on need basis, which will have only few place to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you have non-Ratis index and Ratis TermIndex with different values at the same time?

Copy link
Contributor Author

@sumitagrawl sumitagrawl Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,
ratis index is used in upgrade prepare and snapshot flow for specific purpose -- comparing if flush is completed or not.

Default, will be using OM managed index (non-ratis).

Once if removed ratis index in above cases with new flow, will remove this parameter in this class.

@sumitagrawl sumitagrawl requested a review from adoroszlai January 6, 2025 04:19
@sumitagrawl sumitagrawl added the om-pre-ratis-execution PRs related to https://issues.apache.org/jira/browse/HDDS-11897 label Jan 7, 2025
Copy link
Contributor

@kerneltime kerneltime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactoring looks good, but the overall changes planned need more explanation. Since we are still using TermIndex and serializing it to ByteString and using it for DoubleBuffer, how does it change once we control the object ID generation and write batches to the ratis log.

* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.ozone.om.ratis;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be located in org.apache.hadoop.ozone.om instead? Since the goal is to make this independent of Ratis in the long run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will follow below structure,

org.apache.hadoop.ozone.om.execution
- factory/. // factory to create Execution handler
- request/. // request handling
  - obs
    - OMKeyCreateExecutor.java
    - OMKeyCommitExecutor.java
  - fso
    - OMFileCreateExecutor.java
  ..
- OMGateway.java
- PreRatisExecutionEnabler.java
- flowcontrol/
  - ExecutionContext.java
  - PoolExecutor.java
  - PostExecutionContext.java
  - DbChangeRecorder.java
  - RequestContext.java
- IndexGenerator.java. // generate unique index

import org.apache.ratis.server.protocol.TermIndex;

/**
* context required for execution of request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* context required for execution of request.
* Context required for execution of a request.

Comment on lines 43 to 45
if (null == termIndex) {
return TermIndex.valueOf(-1, index);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the constructor?

@sumitagrawl sumitagrawl merged commit f8394cf into apache:master Jan 8, 2025
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
om-pre-ratis-execution PRs related to https://issues.apache.org/jira/browse/HDDS-11897
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants