I want to train CNNs on a big dataset via transfer learning, using torch in R. Since my dataset is too big to be loaded all at once, I have to load each sample from the SSD in the dataloader. But loading one batch from my SSD takes about 5-10x as long as processing it (forward pass, backprop, optimizer step). Therefore asynchronous, parallel data loading would be advisable.
As far as I understand torch, this can be done in the dataloader via the num_workers parameter. But using it did not decrease the loading time of a batch in the training loop; it only introduced a big overhead before the first batch is gathered (probably that is where the workers are created). Now I need advice on whether this can be done in torch at all and whether I implemented anything wrong.
Example:
library(torchvision)
library(torch)
dl <- torchvision::image_folder_dataset(
  root = "./data/processed/satalite_images/to_use",
  loader = function(path) {
    # I have images of size 299x299 with 13 channels.
    # optimizing this loading step yielded no significant improvement.
    return(array(readRDS(path), dim = c(13, 299, 299)) * 1.0)
  },
  target_transform = function(x) {
    a <- c(0.0, 1.0)[x]
    dim(a) <- 1
    return(a)
  }
)
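# Optional sanity check (an addition to the original example; assumes the
# dataset exposes .getitem() as described in torch's dataset() docs): load one
# item and confirm it has a 13x299x299 image and a length-1 target before
# worrying about speed.
# item <- dl$.getitem(1)
# str(item)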
# Here I set num_workers to different numbers, but that did not change the loading time
dl2 <- torch::dataloader(dl, batch_size = 110L, shuffle = TRUE, num_workers = 15L, pin_memory = TRUE)
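# Note (an assumption about the mechanism, not something stated above): with
# num_workers > 0 the dataset, including the loader closure, has to be
# serialized to separate R worker processes, so any packages or objects it
# needs must be available there. dataloader() also takes worker_packages /
# worker_globals arguments (check the current help page) for exactly that; if
# parallel loading silently falls back to sequential loading, a call like the
# commented one below could be tried:
# dl2 <- torch::dataloader(dl, batch_size = 110L, shuffle = TRUE,
#                          num_workers = 15L, pin_memory = TRUE,
#                          worker_packages = c("torch", "torchvision"))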
# just a random pretrained model for transfer learning
model_torch <- torchvision::model_alexnet(pretrained = TRUE)
model_torch$parameters |>
  purrr::walk(function(param) param$requires_grad_(FALSE))
# replacing the last layer with my desired classifier
inFeat <- model_torch$classifier$`6`$in_features
model_torch$classifier$`6` <- nn_linear(inFeat, out_features = 1L)
# I have 13 input channels, therefore I replace the first conv layer
# with an equivalent one that takes 13 input channels
conv1 <- torch::nn_conv2d(
  in_channels  = 13L,
  out_channels = model_torch[[1]]$`0`$out_channels,
  kernel_size  = model_torch[[1]]$`0`$kernel_size,
  stride       = model_torch[[1]]$`0`$stride,
  padding      = model_torch[[1]]$`0`$padding,
  dilation     = model_torch[[1]]$`0`$dilation,
  groups       = model_torch[[1]]$`0`$groups,
  bias         = TRUE
)
model_torch[[1]]$`0` <- conv1
model_torch <- model_torch$to(device = "cuda")
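# Optional shape check (an addition, not part of the original example): a dummy
# 13-channel batch of two images should flow through the modified network and
# yield a 2x1 output.
with_no_grad({
  dummy <- torch_randn(2, 13, 299, 299, device = "cuda")
  print(model_torch(dummy)$shape)
})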
opt <- optim_adam(params = model_torch$parameters, lr = 0.01)
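# Optional (an addition): since most parameters are frozen above, the optimizer
# could be given only the trainable ones instead, e.g.
# opt <- optim_adam(
#   params = Filter(function(p) p$requires_grad, model_torch$parameters),
#   lr = 0.01
# )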
# training loop
for (e in 1:1) {
  losses <- c()
  # storing the time the loop spends on computing and on data loading
  end <- Sys.time()
  coro::loop(
    for (batch in dl2) {
      start <- Sys.time()
      # this is the time it takes to load a batch
      print(start - end)
      print("computing")
      opt$zero_grad()
      pred <- model_torch(batch[[1]]$to(device = "cuda"))
      res <- batch[[2]]$to(device = "cuda")
      loss <- nnf_binary_cross_entropy(input = torch_sigmoid(pred), target = res)
      loss$backward()
      opt$step()
      losses <- c(losses, loss$item())
      end <- Sys.time()
      # this is the time it takes to process a batch
      print(end - start)
      print("loading")
    }
  )
}
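To separate the loader from the GPU work, the dataloader can also be timed on its own. The sketch below is an addition to the example (time_loader is a made-up helper and it reuses the dl dataset defined above); it times one full pass for a given number of workers so that num_workers = 0 can be compared against num_workers > 0.
time_loader <- function(workers) {
  # one full pass over the dataloader with no model attached,
  # so the measured time reflects data loading alone
  dl_bench <- torch::dataloader(dl, batch_size = 110L, shuffle = TRUE,
                                num_workers = workers)
  t0 <- Sys.time()
  coro::loop(for (batch in dl_bench) force(batch))
  difftime(Sys.time(), t0, units = "secs")
}
print(time_loader(0L))
print(time_loader(15L))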
To my understanding, the time it takes to load a batch should (after the first few batches) decrease significantly with parallel batch loading through num_workers, compared to num_workers = 0.
But the printed time stays the same no matter how many workers are used.
I would be glad if anyone could help me!