Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17548: Switch all public Java APIs from File to Path #2907

Merged
merged 12 commits into from
Dec 19, 2024

Conversation

mlbiscoc
Copy link
Contributor

@mlbiscoc mlbiscoc commented Dec 13, 2024

https://issues.apache.org/jira/browse/SOLR-17548

Description

Move from File to the more modern java.nio.file.Path

Solution

Tried to find all of the public constructors/methods where File is being used or returned as a type and do it's conversion to Path instead. If the conversion was simple, also changed the File methods being used to its equivalent Path methods to stop using the toFile() call.

There were some places where it required a much larger rework which felt at of scope of this already large number of changes. For those I placed a // TODO SOLR-8282 move to PATH

Tests

Modified most uses of File to Path and many of the changes revolve around me removing the toPath().

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Copy link
Contributor

@epugh epugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really excited about this refactoring! I made some comments, mostly around "hey, this is a nice chnage" or "hey, is this an opporutnity" versus "please change this" type comments.

One thing I am curious is how do we know all the conversions are done? I like the addition of the TODO to point out where more refactorings can be done. I guess I am wondering, once we have a lot of .toString() or .toPath(), how do we go back and remove the excess conversion loops? Also, i wonder if we need to add when this is done some forbiddenapi entries for file? I triggered the CI process... hopefully CI will run on all your other commits.

@@ -267,7 +267,7 @@ public void startMiniCluster(int nodeCount) {
cluster =
new MiniSolrCloudCluster.Builder(nodeCount, miniClusterBaseDir)
.formatZkServer(false)
.addConfig("conf", getFile("src/resources/configs/cloud-minimal/conf").toPath())
.addConfig("conf", getFile("src/resources/configs/cloud-minimal/conf"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love seeing the changed version is shorter and simpler than the original one!

public void setDataDir(File dataDir) {
this.dataDir = dataDir;
public void setDataDir(Path dataDir) {
this.dataDir = dataDir.toFile();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this an example of where you felt changing things got out of hand? Otherwise, seems like if dataDir was a Path, then no .toFile()??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this one is setting dataDir from an import existing in org.apache.zookeeper.server.quorum.QuorumPeerConfig which is outside of Solr. If you see toFile() it is most likely because it either got out of hand and I put TODO to come back later or it wasn't possible to use Path strictly

@@ -339,6 +339,8 @@ public String getDataHome(CoreDescriptor cd) throws IOException {

public void cleanupOldIndexDirectories(
final String dataDirPath, final String currentIndexDirPath, boolean afterCoreReload) {

// TODO SOLR-8282 move to PATH
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your stepwise logic is probably much better than a "lets change it all in one fell swoop!"

@@ -294,7 +294,7 @@ public void testDoFilesMode() throws IOException {
postTool.recursive = 0;
postTool.dryRun = true;
postTool.solrUpdateUrl = URI.create("http://localhost:8983/solr/fake/update");
File dir = getFile("exampledocs");
File dir = getFile("exampledocs").toFile();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we just do a toString, maybe we can simplify this line???

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matbe change postFiles to take in a list of paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels out of scope for now but agreed maybe we can simplify here. Unless you feel strongly about it, will add a TODO and I can do on a separate PR.

@@ -303,7 +303,7 @@ public void testDoFilesMode() throws IOException {
public void testDetectingIfRecursionPossibleInFilesMode() throws IOException {
PostTool postTool = new PostTool();
postTool.recursive = 1; // This is the default
File dir = getFile("exampledocs");
File dir = getFile("exampledocs").toFile();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise, maybe we just make strings and avoid going thoruh the file type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or stay as a Path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I actually had done a lot of this work already across the repo but there were so many tests using File and I tried to avoid the conversions or going through File, the changes ended up over 800+ lines... So holding those code changes on my Fork for now.

For now, I was planning on keeping the conversions of toFile() and toPath() and we can move out of all of these conversions as best as possible on another PR. Should be easy with a search. I think here I was trying to get as many of the APIs return types and parameters off of File to keep in scope so it was reviewable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like maybe an opportunity to change from using either Strings and Files to just path and avoid some conversions??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read comment above

@@ -157,7 +157,7 @@ private static MiniSolrCloudCluster startCluster(

// new SolrCloudTestCase.Builder(1, baseDir).addConfig("conf",
// getFile("configs/cloud-minimal/conf").toPath()).configure();
cluster.uploadConfigSet(getFile("configs/cloud-minimal/conf").toPath(), "conf");
cluster.uploadConfigSet(getFile("configs/cloud-minimal/conf"), "conf");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it!

@@ -44,7 +44,7 @@ public static void beforeClass() throws Exception {
initCore(
"solrconfig-script-updateprocessor.xml",
"schema.xml",
getFile("scripting/solr").getAbsolutePath());
getFile("scripting/solr").toAbsolutePath().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to change initCore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reply as comment above.

@@ -41,7 +41,6 @@ public static void setupCluster() throws Exception {
.addConfig(
"conf",
getFile("solrj")
.toPath()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!

@mlbiscoc
Copy link
Contributor Author

Really excited about this refactoring! I made some comments, mostly around "hey, this is a nice chnage" or "hey, is this an opporutnity" versus "please change this" type comments.

One thing I am curious is how do we know all the conversions are done? I like the addition of the TODO to point out where more refactorings can be done. I guess I am wondering, once we have a lot of .toString() or .toPath(), how do we go back and remove the excess conversion loops? Also, i wonder if we need to add when this is done some forbiddenapi entries for file? I triggered the CI process... hopefully CI will run on all your other commits.

Thanks for helping review @epugh! I found all imports of java.io.File and tried to remove it as much as possible. Mostly tried to fix using File if it was trival, otherwise you'll see the toFile() or toPath() This is a first pass through and looks like tests pass. Second pass would probably be to try and remove toFile()ortoPath()as well as creation of anynew File()` as much as possible. If we can make it a forbidden API, the build will tell us where all of them are being called to continue with clean up.

I found the File api was used so much in Tests, when I was doing conversions the number of lines changes became too much for this PR. I am holding those on my fork for now and hopefully I can maybe push that PR after this one.

@epugh
Copy link
Contributor

epugh commented Dec 14, 2024

Sounds like you have a strong plan for moving forward @mlbiscoc and that's awesome. Do holler when you are ready for review and commit.

@mlbiscoc
Copy link
Contributor Author

Sounds like you have a strong plan for moving forward @mlbiscoc and that's awesome. Do holler when you are ready for review and commit.

This should be ready for review unless you want me to bring in those other changes but was going for that in a separate PR

@epugh
Copy link
Contributor

epugh commented Dec 14, 2024

Cool... I'll leave it open for a few days to see what else crops up.. Ping me if we don't merge it by mid week!

@@ -78,7 +77,7 @@ protected LockFactory createLockFactory(String rawLockType) throws IOException {

@Override
public String normalize(String path) throws IOException {
return super.normalize(new File(path).getCanonicalPath());
return super.normalize(Path.of(path).toAbsolutePath().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File.getCanonicalPath is not identical to Path.toAbsolutePath. It requires adding normalize().

Comment on lines 1524 to 1531
private boolean copyTmpTlogFiles2Tlog(Path tmpTlogDir) {
Path tlogDir =
FileSystems.getDefault().getPath(solrCore.getUpdateHandler().getUpdateLog().getTlogDir());
Path backupTlogDir =
FileSystems.getDefault()
.getPath(tlogDir.getParent().toAbsolutePath().toString(), tmpTlogDir.getName());
.getPath(
tlogDir.getParent().toAbsolutePath().toString(),
tmpTlogDir.getFileName().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks more complicated than it needs to be. It's highly unusual for anyone to need to refer to FileSystems. Look at the implementation of Path.of(...) which proves we can just use Path.of here.

But hey, copyTmpTlogFiles2Tlog isn't used at all so let's just delete it!

log.info("File-based storage initialized to use dir: {}", storageDir);
}

@Override
public boolean exists(String storedResourceId) throws IOException {
return (new File(storageDir, storedResourceId)).exists();
return (Files.exists(Path.of(storageDir, storedResourceId)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a needless parenthesis here

@@ -108,7 +109,7 @@ public Lookup create(NamedList<?> params, SolrCore core) {

try {
return new AnalyzingInfixSuggester(
FSDirectory.open(new File(indexPath).toPath()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow that line of code was sad.

final String prefix = TLOG_NAME + '.';
String[] names = directory.list((dir, name) -> name.startsWith(prefix));
String[] names = directory.toFile().list((dir, name) -> name.startsWith(prefix));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't call toFile

Comment on lines 35 to 36
File testHome = createTempDir().toFile();
FileUtils.copyDirectory(getFile("analysis-extras/solr"), testHome);
FileUtils.copyDirectory(getFile("analysis-extras/solr").toFile(), testHome);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skip File conversion and use PathUtils.copyDirectory

Comment on lines 28 to 29
File testHome = createTempDir().toFile();
FileUtils.copyDirectory(getFile("analysis-extras/solr"), testHome);
FileUtils.copyDirectory(getFile("analysis-extras/solr").toFile(), testHome);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same as my last comment)

Comment on lines 31 to 32
File testHome = createTempDir().toFile();
FileUtils.copyDirectory(getFile("analysis-extras/solr"), testHome);
FileUtils.copyDirectory(getFile("analysis-extras/solr").toFile(), testHome);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same)

Comment on lines 58 to 59
File testHome = createTempDir().toFile();
FileUtils.copyDirectory(getFile("clustering/solr"), testHome);
FileUtils.copyDirectory(getFile("clustering/solr").toFile(), testHome);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

upload.setUploadFile(tmpFile, "application/zip");
upload.setUploadFile(tmpFile.toPath(), "application/zip");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simply declare tmpFile as Path without conversion and maybe no further changes

@mlbiscoc
Copy link
Contributor Author

@dsmiley Addressed your comments above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@@ -55,16 +54,16 @@ public static void copyFile(File src, File destination) throws IOException {
* @param fullFile the File to be synced to disk
* @throws IOException if the file could not be synced
*/
public static void sync(File fullFile) throws IOException {
if (fullFile == null || !fullFile.exists())
public static void sync(Path fullFile) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now remove the method

@mlbiscoc
Copy link
Contributor Author

now remove the method

Oh duh... Removed!

@epugh epugh merged commit ae93adc into apache:main Dec 19, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment