OkHttp requests not going through the proxy
Background
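One day, while sending crawler requests through OkHttp with a proxy configured, I noticed that some URLs did not go through the proxy and instead connected directly to the target. The pseudocode is shown below. A ProxySelector was set when the OkHttpClient was built, but when the request was made with okHttpClient.newCall, ProxySelector.select was never called to obtain a proxy, and the log line was never printed.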
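    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Proxy;
    import java.net.ProxySelector;
    import java.net.SocketAddress;
    import java.net.URI;
    import java.util.Collections;
    import java.util.List;
    import okhttp3.OkHttpClient;
    import okhttp3.Request;
    import okhttp3.Response;

    // `log` is the logger of the enclosing class.
    public void call(String url) throws IOException {
        ProxySelector proxySelector = new ProxySelector() {
            @Override
            public List<Proxy> select(URI uri) {
                // Expected for every request, but missing for the affected URLs.
                log.info("run into proxy");
                Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 80));
                return Collections.singletonList(proxy);
            }

            @Override
            public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
                // Nothing to do.
            }
        };
        OkHttpClient client = new OkHttpClient.Builder().proxySelector(proxySelector).build();
        Request request = new Request.Builder().url(url).build();
        // newCall only creates the call; execute() actually sends the request.
        try (Response response = client.newCall(request).execute()) {
            log.info("status: " + response.code());
        }
    }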
OkHttp & proxies
See: Android | Thoroughly understanding OkHttp proxies and routing (original title: 彻底理解 OkHttp 代理与路由)
Why the proxy was not used
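When OkHttp selects a proxy, it first converts the request URL into a URI. If the URI's host is null, OkHttp skips the ProxySelector and connects to the URL directly.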
okhttp3.internal.connection.RouteSelector
    private fun resetNextProxy(url: HttpUrl, proxy: Proxy?) {
      fun selectProxies(): List<Proxy> {
        // If the user specifies a proxy, try that and only that.
        if (proxy != null) return listOf(proxy)

        // If the URI lacks a host (as in "http://</"), don't call the ProxySelector.
        val uri = url.toUri()
        // Here: if the host parses to null, the configured ProxySelector is not used.
        if (uri.host == null) return immutableListOf(Proxy.NO_PROXY)

        // Try each of the ProxySelector choices until one connection succeeds.
        val proxiesOrNull = address.proxySelector.select(uri)
        if (proxiesOrNull.isNullOrEmpty()) return immutableListOf(Proxy.NO_PROXY)

        return proxiesOrNull.toImmutableList()
      }

      eventListener.proxySelectStart(call, url)
      proxies = selectProxies()
      nextProxyIndex = 0
      eventListener.proxySelectEnd(call, url, proxies)
    }
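To make the three branches above easier to experiment with, here is a minimal Java sketch that mirrors the decision using only java.net classes. It is not OkHttp code: the class ProxyDecisionSketch, the helper selectProxies, and CountingSelector are hypothetical names, and URI.create is assumed to stand in for HttpUrl.toUri(), which does more work in the real library. The host bad_host.example.com is an illustrative input containing an underscore.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

class ProxyDecisionSketch {

    /** Hypothetical selector that counts select() calls so we can see whether it was consulted. */
    static final class CountingSelector extends ProxySelector {
        int calls = 0;

        @Override
        public List<Proxy> select(URI uri) {
            calls++;
            return Collections.singletonList(
                new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved("127.0.0.1", 80)));
        }

        @Override
        public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
            // Nothing to do in this sketch.
        }
    }

    /** Mirrors the branch order of the Kotlin selectProxies() shown above. */
    static List<Proxy> selectProxies(String url, Proxy explicitProxy, ProxySelector selector) {
        // 1. An explicitly configured proxy wins, and the host is never inspected.
        if (explicitProxy != null) return Collections.singletonList(explicitProxy);

        // 2. Assumption: URI.create stands in for HttpUrl.toUri().
        URI uri = URI.create(url);
        if (uri.getHost() == null) return Collections.singletonList(Proxy.NO_PROXY);

        // 3. Only now is the ProxySelector asked.
        List<Proxy> proxies = selector.select(uri);
        if (proxies == null || proxies.isEmpty()) return Collections.singletonList(Proxy.NO_PROXY);
        return proxies;
    }

    public static void main(String[] args) {
        CountingSelector selector = new CountingSelector();
        System.out.println(selectProxies("http://example.com/", null, selector));
        System.out.println(selectProxies("http://bad_host.example.com/", null, selector));
        System.out.println("select() calls: " + selector.calls);
    }
}
```

Because the first branch returns before the host check, a proxy set explicitly (in OkHttp, through OkHttpClient.Builder.proxy) would be used even for such URLs. Whether that fits a crawler that rotates proxies through a ProxySelector is a separate design question.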
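Following val uri = url.toUri() all the way down, the code that actually extracts the host name is shown below. When java.net.URI parses a URI whose host is not a valid hostname, it does not fail; it falls back to setting host to null.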
Excerpt from the java.net.URI class
    private int parseAuthority(int start, int n)
        throws URISyntaxException
    {
        ...
        if (serverChars) {
            // Might be (probably is) a server-based authority, so attempt
            // to parse it as such. If the attempt fails, try to treat it
            // as a registry-based authority.
            try {
                // The host name is parsed here; an invalid one throws URISyntaxException.
                q = parseServer(p, n);
                if (q < n)
                    failExpecting("end of authority", q);
                authority = substring(p, n);
            } catch (URISyntaxException x) {
                // Undo results of failed parse
                userInfo = null;
                // host is reset to null
                host = null;
                port = -1;
                if (requireServerAuthority) {
                    // If we're insisting upon a server-based authority,
                    // then just re-throw the exception
                    throw x;
                } else {
                    // ex = x;
                    q = p;
                }
            }
        }
        ...
        return n;
    }
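The fallback in the excerpt can be observed directly with java.net.URI. The sketch below is a small, hypothetical demo (the class UriHostDemo and its helper names are made up for illustration). It uses a host containing an underscore, which java.net.URI does not accept as a server-based hostname, and compares it with an ordinary host. URI.parseServerAuthority() is the public method that insists on a server-based authority, so it surfaces the exception that the lenient parse swallows.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriHostDemo {

    /** Host as seen by java.net.URI; null when the authority is not a valid server-based one. */
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    /** The raw authority is still kept even when the host is null. */
    static String authorityOf(String url) {
        return URI.create(url).getAuthority();
    }

    /** True if the authority parses as user-info, host, and port; false if it only parses leniently. */
    static boolean isServerAuthority(String url) {
        try {
            URI.create(url).parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] urls = {"http://example.com/", "http://bad_host.example.com/"};
        for (String url : urls) {
            System.out.println(url
                + " host=" + hostOf(url)
                + " authority=" + authorityOf(url)
                + " serverAuthority=" + isServerAuthority(url));
        }
    }
}
```

A check like isServerAuthority could be run on crawler URLs before they are handed to OkHttp, so that requests which would silently bypass the ProxySelector are rejected or logged instead. That is one possible guard, not something the original setup did.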
Reference: A "bug" in the JDK (java.net.URL) | 唐磊's personal blog