Skip to content

Commit

Permalink
fix: SAX::ParserContext keeps a reference to the input (#3395)
Browse files Browse the repository at this point in the history
**What problem is this PR intended to solve?**

I saw a failure in CI at
https://github.com/sparklemotion/nokogiri/actions/runs/12529532790/job/34945362057?pr=3393

```
  1) Error:
Nokogiri::HTML4::SAX::ParserContext::constructor::encoding::.io#test_0001_supports passing encoding name:
NotImplementedError: method `read' called on terminated object (0x0000005520809790)
    test/html4/sax/test_parser_context.rb:60:in `parse_with'
    test/html4/sax/test_parser_context.rb:60:in `block (5 levels) in <module:SAX>'
```

So now the class keeps a reference to the input object to prevent it
from being GCed before we parse it.


**Have you included adequate test coverage?**

Yes.


**Does this change affect the behavior of either the C or the Java
implementations?**

N/A
  • Loading branch information
flavorjones authored Dec 29, 2024
2 parents 1d83082 + f2a9275 commit 82ffa0c
Show file tree
Hide file tree
Showing 2 changed files with 32 additions and 2 deletions.
10 changes: 8 additions & 2 deletions ext/nokogiri/xml_sax_parser_context.c
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,10 @@ noko_xml_sax_parser_context_s_native_io(VALUE rb_class, VALUE rb_io, VALUE rb_en
c_context->sax = NULL;
}

return noko_xml_sax_parser_context_wrap(rb_class, c_context);
VALUE rb_context = noko_xml_sax_parser_context_wrap(rb_class, c_context);
rb_iv_set(rb_context, "@input", rb_io);

return rb_context;
}

/* :nodoc: */
Expand Down Expand Up @@ -154,7 +157,10 @@ noko_xml_sax_parser_context_s_native_memory(VALUE rb_class, VALUE rb_input, VALU
c_context->sax = NULL;
}

return noko_xml_sax_parser_context_wrap(rb_class, c_context);
VALUE rb_context = noko_xml_sax_parser_context_wrap(rb_class, c_context);
rb_iv_set(rb_context, "@input", rb_input);

return rb_context;
}

/*
Expand Down
24 changes: 24 additions & 0 deletions test/test_memory_usage.rb
Original file line number Diff line number Diff line change
Expand Up @@ -313,5 +313,29 @@ def start_element(name, attrs = [])
# Expected error. This comment makes rubocop happy.
end
end

it "XML::SAX::ParserContext.io holds a reference to IO input" do
content = File.read(XML_ATOM_FILE)

memwatch(__method__) do
pc = Nokogiri::XML::SAX::ParserContext.io(StringIO.new(content), "ISO-8859-1")
parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::SAX::TestCase::Doc.new)
GC.stress
pc.parse_with(parser)

assert_equal(472, parser.document.data.length)
end
end

it "XML::SAX::ParserContext.memory holds a reference to string input" do
memwatch(__method__) do
pc = Nokogiri::XML::SAX::ParserContext.memory(File.read(XML_ATOM_FILE), "ISO-8859-1")
parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::SAX::TestCase::Doc.new)
GC.stress
pc.parse_with(parser)

assert_equal(472, parser.document.data.length)
end
end
end if ENV["NOKOGIRI_MEMORY_SUITE"] && Nokogiri.uses_libxml?
end

0 comments on commit 82ffa0c

Please sign in to comment.