Implement Kerberos authentication support for Hive Catalog#766
Implement Kerberos authentication support for Hive Catalog#766yothinix wants to merge 3 commits intoapache:mainfrom
Conversation
ea0ce8f to
259978e
Compare
259978e to
d17b7ce
Compare
|
Hi @Fokko I added new change as commented, could you help review it again Also, As I tested more on HMS behind Kerberize we found out that it's required the thrift client to be re-create every time we open a new connection, So that the implementation was change a bit to handle this case. The mentioned change is in this commit a7b8b1c |
|
LGTM+1 I had something very similar to this patched locally. It would be good to get merged asap |
|
Hi @yothinix do you mind rebasing with main? I think this is very close to being merged, I can work with you to get it through. |
a7b8b1c to
322d143
Compare
|
Hi @kevinjqliu Thank you for look into this PR, I just rebased the PR branch to latest main branch as requested. |
kevinjqliu
left a comment
There was a problem hiding this comment.
added a few minor comments
| When using Hive 2.x, make sure to set the compatibility flag: | ||
|
|
||
| ```yaml | ||
| catalog: | ||
| default: | ||
| ... | ||
| hive.hive2-compatible: true | ||
| ``` |
There was a problem hiding this comment.
nit: leave this as is, since this is only required for hive2.
and maybe add similar block for hive.kerberos-authorization since that's also optional
| HIVE2_COMPATIBLE = "hive.hive2-compatible" | ||
| HIVE2_COMPATIBLE_DEFAULT = False | ||
|
|
||
| HIVE_KERBEROS_AUTH = "hive.kerberos-authorization" |
There was a problem hiding this comment.
nit: authentication is different than authorization
| HIVE_KERBEROS_AUTH = "hive.kerberos-authorization" | |
| HIVE_KERBEROS_AUTH = "hive.kerberos-authentication" |
Kerberos is authentication https://docs.cloudera.com/cdw-runtime/1.5.1/securing-hive/topics/hive_remote_data_access.html
| if not self._kerberos_auth: | ||
| self._transport = TTransport.TBufferedTransport(socket) | ||
| else: | ||
| self._transport = TTransport.TSaslClientTransport(socket, host=url_parts.hostname, service="hive") |
There was a problem hiding this comment.
is "hive" a static value here or should it be configurable?
| if not self._kerberos_auth: | ||
| self._transport.open() | ||
| else: | ||
| self._init_thrift_client() |
There was a problem hiding this comment.
can this be called multiple times? The init function already calls this
There was a problem hiding this comment.
I think we can remove it here, it is also called on line 149
|
It would be really great if this could make it into 0.8. :) |
|
hi @yothinix thanks for the contribution. Would love like to address the comment above? Would be great to have this as part of the 0.9.0 release |
Took #766 and addressed the comments, to make sure that it gets into 0.9.0 --------- Co-authored-by: Yothin Muangsommuk <m@yothinix.com>
Change proposed
hive.kerberos-authorization=trueResolves #135