HTTPS hostname wrong issue

Sometimes, we work on interesting problems. Recently, My team was developing our service to service auth architecture at Financial Engines. We completed initial development and checked in changes in our legacy service which was invoking one of the microservices behind our apigateway.

Post deployment, I tested the changes in our development environment and everything appeared okay. However, we started receiving below errors in our test environment

Caused by: java.io.IOException: HTTPS hostname wrong:  should be <api.feitest.com>
at sun.net.www.protocol.https.HttpsClient.checkURLSpoofing(HttpsClient.java:649)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:573)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at com.sun.net.ssl.internal.www.protocol.https.HttpsURLConnectionOldImpl.connect(HttpsURLConnectionOldImpl.java:142)
at org.springframework.http.client.SimpleBufferingClientHttpRequest.executeInternal(SimpleBufferingClientHttpRequest.java:76)
at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:49)
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:519)

I started looking into the issue and from the first looks, it appeared that SSL handshake was failing since server’s certificate CN name was not matching with the host we were trying to connect to.

It was interesting to see that we never faced this issue in other environments including our local environment which has pretty much same configuration as of dev, test environments except few things!

As per best practices, all environments should have exact same configuration but this being a legacy service we have some inconsistencies in place.

I also noticed that this issue was intermittent and server bounce sometimes appeared to resolve the issue. However, other times, it caused the issue. So being a intermittent issue, best things we could have done was to compile all the information which might give us some leads.

Initial Analysis


back to top

  • In short, issue appears intermittently. See below data for more information.
    • 79 failure events from 12/4/16 12:00:00.000 AM to 12/8/16 9:43:59.000 AM
    • 31 success events (12/4/16 12:00:00.000 AM to 12/8/16 9:45:08.000 AM)
  • When issue appears, SSL handshake failed as it was evident from exception.
  • Looking at the logs in past few days, I saw that this issue is happening in all development and test environments intermittently but not consistently.
  • We have other internal, external clients connecting to same endpoint and they never faced the issue.
  • We were using Spring Template library to invoke other Microservice. As you might know, Spring Rest library provides means for synchronous client side HTTP access. It handles HTTP connections and helps in writing client code in much more cleaner way.

In nutshell, it was pretty clear that the issue was very much related to legacy service client side code. So I just looked at the versions we had in our dev,test environments. Production environment was to be upgraded to JRE8 in couple of weeks.

  • Dev / Test
    • Tomcat 6.0.45
    • JRE 1.8.0_101
  • Production
    • Tomcat 6.x
    • JRE 1.7.x

When HTTPS hostname issue appears?


back to top

Java by default verifies that the certificate CN (Common Name) is the same as hostname in the URL. If the CN in the certificate is not the same as the host name, your web service client fails with the following exception: java.io.IOException: HTTPS hostname wrong: should be hostname as in the certificates.

Quick search on google indicated following scenarios when this issue could appear

  • As per this stackoverflow link It appeared that certain misconfigured servers send an “Unrecognized Name” warning in the SSL handshake which is ignored by most clients except for Java which might cause this.
  • Other scenario was that it could happen if you are using non-CA issued cert.
  • Another Stackoverflow link also suggested that this issue might also come when you have some code for the separate JSSE that shipped prior to JDK 1.4. Delete the java.protocol.handler.pkgs setting, delete anything that adds the JSSE provider, delete anything that refers to com.sun.net.ssl.anything. If you have any other relices of the separate JSSE installation delete them too. The jre/lib/jsse.jar should be dated no earlier than December 2004.

However, I validated that first 2 didn’t apply in our case.

Initial troubleshooting


back to top

I could have taken many directions from here:

  • I tested the https endpoint via openssl client and browser and SSL handshake was perfectly fine.
  • I was not able to reproduce issue programmatically as well even after having exact same versions for jdk/web server locally.
  • I tried running the api under load just to avoid heavy load theory with 500, 1 K, 10 K invocations but we were not able to replicate as expected.
  • We enabled ssl debugging on the server by adding -Djavax.net.debug=ssl,handshake property to get additional information.
    • As per ssl debug logs, we noticed that server_name extension is not getting sent by the client.
    • See the screenshot below. Right side shows failure scenario and its clear that extension_server name header is not getting sent in failure scenario and due to that https handshake fails.

    ssl_debug_logs_comparison

    • So to fix this issue, we have to find out why sni headers are not getting sent by default?
  • As a company, we were in transition to move to JRE8. We were following a phase wise approach where we were upgrading the run time to JRE8 followed by JDK. Our techops team upgraded one of the servers in production to JRE8 around same time. Luckily, We had similar invocation and alerting in place and we started seeing this issue. Once, we rolled back the application to run on JRE7, everything worked fine. Technically, it was our first clue that there is something in jre8 which is causing this.

Overall, I had to answer one question, why sni headers are getting sent in JRE8 environments intermittently.

SNI stands for Server Name Indication and is an extension of the TLS protocol. It indicates which hostname is being contacted by the client at the beginning of the ‘handshake’-process. This allows a server to connect multiple SSL Certificates to one IP address and present relevant certificate to client based on SNI header.

Why SNI header is getting sent in JRE8 intermittently?


back to top

So I wrote down couple of questions which i wanted to get answers:

Does SNI implementation changed between JRE7 and JRE8?


back to top

  • The JDK comes standard with automatically installed and configured providers such as “SUN” and “SunJCE”. Oracle’s implementation of Java SE includes a JSSE provider named SunJSSE, which comes preinstalled and preregistered with the JCA.
  • SNI is handled by JSSE implementation library which comes with JRE.
  • SunJSSE has enabled SNI extension for client applications by default in JDK 7.
  • It appears that nothing has changed as per java release documents.

is it classpath issue?


back to top

  • As per one of stackoverflow link, One possibility could have been that jsse 1.0.x might be getting loaded which didn’t support SNI.
  • So first task was to find out which JSSE implementation is getting loaded.
  • I printed all security providers list and their versions and noticed that SunJSSE version was 1.8. SunJSSE 1.8 is the latest version which comes along with JRE8 and hence classpath theory seems to fall apart.

is it due to Java 8144567


back to top

  • As per the bug 8144567 description, SNI does not work with HTTPSUrlConnection if you have a custom HostnameVerifier in the code
  • Quickly checking in the code confirmed that we are definitely adding custom host name verifier although it was not applied globally.
  • I was able to reproduce the bug locally in sample application. I still saw some ray of hope if something is causing custom hostname verifier to be applied globally.
  • I pushed in code to print custom hostname verifier and it does print default custom hostname verifier indicating that it has not been overridden. In same checkin, I also added fix suggested by the bug by adding custom SSL socket factory like - HttpsURLConnection.setDefaultSSLSocketFactory(new FeCustomSSLSocketFactory()); and it did not resolve the issue.
  • Hence, it proved that it’s not the outcome of above-mentioned bug.
  • Just FYI, The bug was Resolved via 8159569 and will be released in 8u122 release which will be resolved to public in January 2017. Refer here to look at the changeset.

is it due to Rest Template library incompatibility with JRE8?

  • Although chances of this were less but I still wanted to get to root of it.
  • One theory is if spring code is explicitly using sun.net packages which might be causing java library to search HTTPS url handler in older url package
  • I looked at the Rest Template library source code and it appeared that Spring is just invoking underlying url connection api to get handle to Https url implementation.
  • Switching the implementation to Jersey Client instead of Spring Rest template did not help either but proved clearly that issue is at the JRE level.

Understanding Java Url Handler Implementation

So It appeared that I have to go one level down to Java’s URL implementation. Next step was to print what URL handler we are getting for our success versus failure scenarios.

  • I added custom clientHttpRequestFactory in RestTemplate so that we can get the handle to HttpsUrlConnection objects manually. After adding custom clientHttpRequestFactory and further logging/debugging I noticed that application is getting 2 instances of HTTPSURLConnection
  • I noticed that whenever request fails, we were getting HttpsURLConnectionOldImpl

     20161216 09:26:19.958 INF: tst120a1iwr20ij0hl|do-tst-10|46857|http-8080-10|domain.auth.ssl.FeCustomClientHttpRequestFactory: [INFO ] domain.auth.ssl.FeCustomClientHttpRequestFactory - CustomSSLSocketFactory - Set connection specific properties for connection=class com.sun.net.ssl.internal.www.protocol.https.HttpsURLConnectionOldImpl
    
  • Whenever request succeeds, we are getting HttpsURLConnectionImpl.

      20161215 15:55:24.081 INF: tst120b1iwpl6au02e|do-tst-10|191|http-8080-16|domain.auth.ssl.FeCustomClientHttpRequestFactory: [INFO ] domain.auth.ssl.FeCustomClientHttpRequestFactory - CustomSSLSocketFactory - Set connection specific properties for connection=class sun.net.www.protocol.https.HttpsURLConnectionImpl
    
  • Probably, this was biggest hint. So main question was when java returns HttpsURLConnectionImpl and when does it return HttpsURLConnectionOldImpl? and why its not consistent?

When does Java returns HttpsURLConnectionImpl and when does it return HttpsURLConnectionOldImpl?

The following is the default hierarchy, and is found by default when a “https” URL is used.

 java.lang.Object -> 
java.net.URLConnection -> 
java.net.HttpURLConnection -> 
javax.net.ssl.HttpsURLConnection -> 
sun.net.www.protocol.https.HttpsURLConnectionImpl - Sun impl

The following is the hierarchy used for backwards compatibility with JSSE 1.0.x.

java.lang.Object -> 
java.net.URLConnection -> 
java.net.HttpURLConnection -> 
com.sun.net.ssl.HttpsURLConnection ->
com.sun.net.ssl.internal.www.protocol.https.HttpsURLConnectionOldImpl - impl

In order to get an instance of com.sun.net.ssl.HttpsURLConnection, you have to override the URL search path via java.protocol.handler.pkgs property. So it appeared that somewhere this property was getting overridden causing the issue.

I noticed in logs that whenever server started java.protocol.handler.pkgs was null and whenever request failed, this property was getting set to com.sun.net.ssl.internal.www.protocol. It also explains why we were getting old HttpsURLConnection implementation which does not appear to support SNI.

java.protocol.handler.pkgs property sets your URL resolution to override the normal javax.net.ssl.HttpsURLConnecction and points at the old HttpsURLConnection implementation, which is an instance of com.sun.net.ssl.HttpsURLConnection.

So next logical steps were validating the hypothesis locally and find the root cause.

Reproduce the problem locally


back to top

In my local project, I set this property and tried to connect to https endpoint (which required SNI) and was able to reproduce the issue when run against JRE8. Next thing was to run the same program against JRE7 and i noticed that ssl handshake and request was successful.

This proved the actual behaviour we were facing where everything worked fine in JRE7 but failed intermittently in JRE8.

Find the culprit


back to top

I did quick research in the project but could not find the location where we were setting the property. So it appeared that this could be functionality added by tomcat, JRE or 3rd party jars.

I followed below steps on high level

  • extracted all classes from all 3rd party jars
  • ran the jd-gui java decompiler
  • saved all of that source to single directory
  • ran a find command to find the reference of this property

     find . -type f -name '*java'  -exec grep -l  java.protocol.handler.pkgs {} \;
    
  • Below were list of files which I found

     ./com/sun/xml/messaging/saaj/client/p2p/HttpSOAPConnection.java
     ./org/apache/axis/client/Call$1.java
     ./org/apache/axis/client/Call.java
     ./org/apache/axis/wsdl/toJava/JavaServiceImplWriter.java
     ./org/apache/commons/ssl/Java13.java
     ./org/codehaus/classworlds/uberjar/boot/Bootstrapper.java
     ./weblogic/utils/zip/Handler.java
    
  • I validated via tomcat logs that out of above only HttpSOAPConnection and Java13 was getting loaded in JVM.
  • Finally, It appeared that HttpSOAPConnection is the culprit which sets the property if it’s null. Sample code below.

      static
        {
          if (isIBMVM)
          {
            SSL_PKG = "com.ibm.net.ssl.internal.www.protocol";
            SSL_PROVIDER = "com.ibm.net.ssl.internal.ssl.Provider";
          }
          else
          {
            SSL_PKG = "com.sun.net.ssl.internal.www.protocol";
            SSL_PROVIDER = "com.sun.net.ssl.internal.ssl.Provider";
          }
        }
           
        private void initHttps()
        {
          String pkgs = System.getProperty("java.protocol.handler.pkgs");
          log.log(Level.FINE, "SAAJ0053.p2p.providers", new String[] { pkgs });
          if ((pkgs == null) || (pkgs.indexOf(SSL_PKG) < 0))
          {
            if (pkgs == null) {
              pkgs = SSL_PKG;
            } else {
              pkgs = pkgs + "|" + SSL_PKG;
            }
            System.setProperty("java.protocol.handler.pkgs", pkgs);
           ...
          }
        }
    

Solution / Final Comments


back to top

Interim solution, we had in place was to set the property java.protocol.handler.pkgs=sun.net.www.protocol so that the culprit class does not overwrite the property. It also allows url handler to handover the newer implementation of HttpsUrlConnection which supports SNI.

Final solution would be to follow up with Oracle and understand what could have been done better to not break other clients.

Special thanks to many folks at Financial Engines for being sounding board and in helping to take this issue to closer!

References


# Reference
1 http://stackoverflow.com/questions/16990799/https-hostname-wrong-should-be-xxx-xxx-xxx-xxx-checkurlspoofing-what-is-the
2 http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#Debug
3 http://stackoverflow.com/questions/23517139/java-lang-classcastexception-com-sun-net-ssl-internal-www-protocol-https-httpsu
4 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8144566
5 http://stackoverflow.com/questions/35366763/in-java-8-can-httpsurlconnection-be-made-to-send-server-name-indication-sni

Version History


Date Description
2016-12-24    Initial Version
2017-1-28    Updated the blog with correct handler.pkgs value