- 
                Notifications
    
You must be signed in to change notification settings  - Fork 666
 
Async HTTP #3464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async HTTP #3464
Conversation
6380a1c    to
    8a59aba      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of making some of the HTTP stuff Async, though I wonder what the knock on effect will be on end users as I know some users in the past, myself included, have used Jena's HTTP helpers directly, so the HttpLib changes in particular may have some unexpected side effects for downstream users.
There seems to be some leftover debugging code around exception handling that looks like it needs tidying up
        
          
                jena-arq/src/main/java/org/apache/jena/update/UpdateProcessor.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      
          
 The main change should be only that the original sync methods now internally wait for the completion of a CompletableFuture. My feeling is that maintaining an additional non-async implementation based on the sync java.net.http API might eventually cause issues as it duplicates code that may diverge. The original test cases pass that way, but I am not sure how well tested this part is. Perhaps your previous use of the API could be added as (integration) tests?  | 
    
          
 Totally agree, and no problems with this general approach 
 That was many years ago now and against a much older Jena version, the code in question has bit-rotted horribly. These days I use other APIs for HTTP stuff rather than leaning on Jena's APIs. This might be a non-issue, just something to highlight and make sure we communicate to users in the release this eventually lands in.  | 
    
| 
           It is already async! The JDK implementation of  General use of HTTP/2 and HTTP/3 makes it necessary because they both multiplex the underlying network connection.  | 
    
017f631    to
    791fa78      
    Compare
  
    
          
 Right but that's the JDK internals. For clarification: Jena's  I removed the breadcrumb exceptions (the clean way to retain the breadcrumbs seems to be wrapping with custom   | 
    
91e2943    to
    30b70dd      
    Compare
  
    | } | ||
| // Note: Can't reuse AsyncHttpRDF.handleRuntimeException because of this HttpException. | ||
| throw new HttpException(httpRequest.method()+" "+httpRequest.uri().toString(), cause); | ||
| } else if (cause instanceof RuntimeException re) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is a change to exception handling in the sync case. (I hope I read the code correctly.) Does this matter?
AsyncHttpRDF.getOrElseThrow passes up runtime exceptions untouched. That means the app gets the exception from the async thread and a stacktrace does not indicate where in the application code their synchronous call was made.
HttpClientImpl.send builds an exception with the other threads exception as the cause. The app gets an exception with the app stack. When printed, the stacktrace identifies the app code that makes the sync call.
The HttpClientImpl.send exception rebuild is a bit yuk - it has to enumerate the possible classes.
We could have a HttpException, even HttpAsyncException < HttpException, as a carrier to the app call stack making it robust against any future exception introduced by the JDK library. The change would be here, although in AsyncHttpRDF.getOrElseThrow is a possibility as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means the app gets the exception from the async thread and a stacktrace does not indicate where in the application code their synchronous call was made.
Yes, that's the issue.
I think your suggestion to wrap an exception with the already existing unchecked HttpException is the clean way forward. Consolidating this exception handling in AsyncHttpRDF.getOrElseThrow seems reasonable.
A variant of getOrElseThrow with an additional HttpRequest argument could be used to add the request URIs to the exceptions (which is what happens in HttpLib):
throw new HttpException(httpRequest.method()+" "+httpRequest.uri().toString(), cause);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code was revised and is ready for review.
2638547    to
    bf0cf8d      
    Compare
  
            
          
                jena-arq/src/main/java/org/apache/jena/sparql/exec/http/QueryExecHTTP.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      bf0cf8d    to
    45b75b8      
    Compare
  
    | 
           I have completed my revisions: 
  | 
    
45b75b8    to
    1f466ee      
    Compare
  
    | this.retainedConnectionView = new ProxyInputStream(in) { | ||
| @Override | ||
| protected void beforeRead(int n) throws IOException { | ||
| checkNotAborted(); | ||
| super.beforeRead(n); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a note: This is still not the nicest solution because it pulls the rug from under the parsers' feet - but the original solution also just closed the physical HTTP input stream.
Now abort() just sets a flag (instant action) that causes ProxyInputStream's next read to fail - whereas only close() closes the physical stream (may take time to consume remaining data).
If really needed, further revisions could go into future PRs.
          
 As long as someone ™️ does an exhaustive close to the HTTP connection.  | 
    
| 
           One conflict otherwise looks good.  | 
    
1f466ee    to
    41d7165      
    Compare
  
    | 
           Conflict resolved.  | 
    
41d7165    to
    f66ace2      
    Compare
  
    
GitHub issue resolved #3471
This is the async HTTP part factored out from #3184 .
Pull request Description:
QueryExecHTTP.abort()to immediately return even if the server has not yet responded to the connection request.Service.javagetUpdateStringandgetUpdateRequestmethods to UpdateProcessor so that the underlying request may be exposed.[ ] Tests are included(no additional tests added)By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.
See the Apache Jena "Contributing" guide.