-
-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Getting in-memory list from the database stream #428
Comments
@Radamonas found the way as a workaround for the issue related to ODBC drivers. It works with a client cursor, witch is the recommended way to avoid memory overload: |
It does work in provided example, the thing is that it drains both stream cursors to files. What makes me wonder is whether it is possible to have definitions of several streams in the same process without actually outputting them to a file, so that they could be used for combining streams, e.g. look-ups, etc. later on? |
yes, you can do what ever you want from what you issue from any operators, and that counts as well for As maybe there is something I didn't get in your question, give me a sample of the kind of code you would like to achieve, and I will adapt it to work with existing operators. |
What I was referring to with this question is actually this statement and example in the documentation:
The question is: how to you define the |
you just have to define your postStream exactly in the same way you defined authorStream, and combine them the way you want. Maybe this near real life example will help you to understand: |
Sorry to bother you again, but the examples deal with files, and I'm referring to the SQL Server data source and this note in documentation: "To make a lookup, extensions for Sql Server don't provide any operator out of the box. The work around How do you actually accomplish this workaround and perform in memory look up if you have two database streams? |
yes but in the same way you produce data from a file, you produce data from an in memory stream. Does the following example represent what you expected? var categoriesStream = contextStream.CrossApply("create in memory categories", ctx => new[] {
new { Code = "a", Label = "label A", Tax = 0.1 },
new { Code = "b", Label = "label B", Tax = 0.2 },
new { Code = "c", Label = "label C", Tax = 0.3 },
});
contextStream
.CrossApplyFolderFiles("list all required files", "*.zip", true).CrossApplyZipFiles("extract files from zip", "*.csv") // could be retrieved from database... anything...
.CrossApplyTextFile("parse file", FlatFileDefinition.Create(i => new Article
{
Name = i.ToColumn("name"),
Category = i.ToColumn("first name"),
Price = i.ToNumberColumn<double?>("price", ".")
}).IsColumnSeparated(','))
.Lookup("get category label", categoriesStream, l => l.Category, r => r.Code, (l, r) => new
{
Name = l.Name,
CategoryLabel = r.Label,
PriceWithTax = l.Price * r.Tax,
})
[...]; |
It is clearer now, thank you. The one thing I still can't understand is how to declare an in-memory stream from the SQL Database data, because this is, as I understand, currently the only way to combine two database streams. |
I still don't understand your point. In the previous example, FYI, here is the documentation that shows how to combine streams: https://paillave.github.io/Etl.Net/docs/recipes/linkStreams |
Yes, categoriesStream is in-memory, but it is initialized by using array initializer at compile time. My question is how do you populate an in-memory stream from a database table? Documentation says you cannot combine two database streams unless one of the streams is an in memory stream. Is there actually no way to combine to database streams? |
If you want the join to be done by the database engine, your only choice is to make the proper SQL query with the right join statement. But I still have the feeling this is not what you expect, so I'm a bit lost about what puzzles you. var categoriesStream = contextStream.CrossApplySqlServerQuery("get categories", o => o
.FromQuery("select * from dbo.ProductCategory")
.WithMapping(i => new
{
Code = i.ToColumn("Code"),
Label = i.ToColumn("Label"),
Tax = i.ToNumberColumn<decimal>("Tax")
}));
contextStream
.CrossApplyFolderFiles("list all required files", "*.zip", true).CrossApplyZipFiles("extract files from zip", "*.csv") // could be retrieved from database... anything...
.CrossApplyTextFile("parse file", FlatFileDefinition.Create(i => new Article
{
Name = i.ToColumn("name"),
Category = i.ToColumn("first name"),
Price = i.ToNumberColumn<double?>("price", ".")
}).IsColumnSeparated(','))
.Lookup("get category label", categoriesStream, l => l.Category, r => r.Code, (l, r) => new
{
Name = l.Name,
CategoryLabel = r.Label,
PriceWithTax = l.Price * r.Tax,
})
[...]; |
This sentence made everything clear,
Thank you, @paillave, and sorry for bombarding you with questions ;) |
leave it opened for documentation purpose till the answers are on the documentation website |
Hello,
is there a way to get an in memory list as an instantiated
IList<T>
from SQL Server stream and have the database cursor consumed/closed?First I thought extension method
ToList()
is for this purpose, but it appears it returns anISingleStream<List<T>>
If I get a list this way and I don't consume this stream by, say, outputting to a text file and then I try a query like this on the same ODBC connection:
I get error:
The text was updated successfully, but these errors were encountered: