
Any plan for yarn support? #24

Open
Garlandal opened this issue Jan 22, 2019 · 6 comments

Comments

@Garlandal

Hi, we have a YARN cluster with some Flink jobs running on it. Manually deploying jobs takes a lot of steps and too much effort. It seems that this project doesn't support deploying jobs to YARN, or maybe I'm going about it the wrong way.
So, is there any plan for YARN support, or can someone tell me the right way to deploy a Flink job on YARN?

@nielsdenissen
Contributor

Hi Garlandal, we have Flink running in a Kubernetes cluster and no experience with running it on YARN ourselves. If you have a use case and would like to contribute, you're of course very welcome to!

@Garlandal
Author

@nielsdenissen
Haha, maybe I need to learn more about the details of running Flink jobs on YARN first.
Anyway, thanks for the reply~

@fedexist

fedexist commented Feb 27, 2019

Hi @Garlandal, since we're using YARN for our Flink sessions, I've been thinking about how to use flink-deployer in our environment.

Since flink-deployer needs the Job Manager HTTP endpoint, a simple solution would be to use the amRPCAddress of your Flink YARN container (it would be something like http://yarn-node-manager:XXXXX, and can be retrieved via the YARN REST API), without changing anything in the current implementation.
Still, it could be convenient to have this functionality built in, passing the YARN RMs and the application_id (or applicationType or application name) as additional command-line arguments.

That said, if you're storing Flink savepoints on HDFS, I think there would be a need to implement an appropriate client following the afero.Fs interface. But I've only just started looking at the flink-deployer source code, so there may be other ways.
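For illustration, that lookup might look like the following (a minimal sketch against the YARN ResourceManager REST API; the ResourceManager address and the application name filter are assumptions for your cluster):

# Sketch: resolve the Flink JobManager endpoint (amRPCAddress) of a running
# YARN application via the ResourceManager REST API.
# RM_URL and the application name are assumptions -- adjust for your cluster.
import requests

RM_URL = "http://resource-manager:8088"

def flink_jobmanager_endpoint(app_name="Flink session cluster"):
    # /ws/v1/cluster/apps is the standard YARN RM endpoint listing applications
    resp = requests.get(f"{RM_URL}/ws/v1/cluster/apps", params={"states": "RUNNING"})
    resp.raise_for_status()
    for app in resp.json()["apps"]["app"]:
        if app["name"] == app_name:
            return app["amRPCAddress"]
    raise LookupError(f"no running YARN application named {app_name!r}")

print(flink_jobmanager_endpoint())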

@jrask

jrask commented Dec 9, 2019

We are running on YARN and are considering using flink-deployer.
I assume the savepoint URL is simply passed on to the Flink JobManager, so it shouldn't really matter whether it is an HDFS URL or not, since (I guess??) flink-deployer never does anything with this URL/directory itself?

So basically, all you need is the HTTP endpoint. I will try this and see if it works as I expect...
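For reference, triggering a savepoint directly against the JobManager REST API shows why the URL is opaque to the client (a minimal sketch; the JobManager address and job id are placeholders):

# Sketch: trigger a savepoint via the Flink JobManager REST API.
# The target directory is passed through as an opaque string, so an
# hdfs:// URL works as long as the cluster itself can resolve it --
# the client never reads or writes that path.
import requests

JOBMANAGER = "http://jobmanager-host:8081"  # placeholder endpoint
JOB_ID = "<your-job-id>"                    # placeholder job id

resp = requests.post(
    f"{JOBMANAGER}/jobs/{JOB_ID}/savepoints",
    json={"target-directory": "hdfs:///flink/savepoints", "cancel-job": False},
)
resp.raise_for_status()
# The response holds a request-id that can be polled for savepoint completion.
print(resp.json())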

@Garlandal
Author

Garlandal commented Dec 10, 2019

Sorry for the late reply. In the end I found that Docker plus a Python script meets my requirements. Some details:

  1. a Python script that implements some of the Flink command-line commands using subprocess calls and argparse (a sketch follows below)
  2. a Dockerfile based on flink:1.8-scala_2.11 that installs the Python dependencies for the script above
  3. a docker-compose.yml file defining the different deploy modes such as yarn_job or yarn_session, based on different Flink configs and, if you need them, different envs

Finally, the deploy command turns into the following, and more features such as update can be added on top:

docker-compose run yarn_job new test_flink_job
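For illustration, a minimal sketch of such a wrapper script (the subcommand layout, jar path, and flink CLI flags are assumptions, not the actual script):

# Sketch: a deploy wrapper as described above -- argparse for subcommands,
# subprocess to shell out to the flink CLI. The jar path and flags are
# assumptions; adapt to your own images and deployment flow.
import argparse
import subprocess

JOB_JAR = "/opt/jobs/job.jar"  # assumed: job jar baked into the Docker image

def new_job(args):
    # "flink run -m yarn-cluster -d" submits a detached per-job YARN cluster (Flink 1.8)
    subprocess.check_call([
        "flink", "run", "-m", "yarn-cluster", "-d",
        "-ynm", args.name,  # YARN application name
        JOB_JAR,
    ])

parser = argparse.ArgumentParser(description="Deploy Flink jobs on YARN")
sub = parser.add_subparsers(dest="command", required=True)
p_new = sub.add_parser("new", help="submit a new job")
p_new.add_argument("name")
p_new.set_defaults(func=new_job)

args = parser.parse_args()
args.func(args)

With the script baked into the image, "docker-compose run yarn_job new test_flink_job" invokes the "new" subcommand with "test_flink_job" as the application name.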

@jrask

jrask commented Dec 19, 2019

We have tried this on our YARN setup, and it can easily be done with curl and jq to get the JobManager URL from the ResourceManager. No really good reason to add this feature to flink-deployer, as I see it. It might also be something that varies or changes between deployments?

FLINK_BASE_URL=$(curl -s "http://{{ resource_manager }}:8088/ws/v1/cluster/apps?states=RUNNING" | jq -r '.apps.app[] | select(.name == "Flink session cluster") | .amRPCAddress')
