A small Erlang actor demo

I'm an Erlang newbie, and I wrote the following code for a spider:

-module(http).
-compile([export_all]).

init() ->
  ssl:start(),
  inets:start(),
  register(m, spawn(fun() -> loop() end)),
  register(fetch, spawn(fun() -> x() end)),
  ok.

start() ->
  L1 = [114689,114688,114691,114690], % detail page id

  lists:map(fun(Gid) ->
    io:format("~p ~n", [Gid]),
    fetch ! {go, Gid}
  end, L1),
  m ! done,
  done.

main(_) ->
  init(),
  start().

loop() ->
  io:fwrite("this is in loop!!"),
  receive
    {no_res, Gid} ->
      io:format("~p no res ! ~n", [Gid]),
      loop();
    {have_res, Gid} ->
      io:format("~p have res ! ~n", [Gid]),
      loop();
    done ->
      io:format("wowowow", [])
  end.

x() ->
  receive
    {go, Gid} ->
      http_post(Gid);
    _ ->
      ready
  end.

http_post(Gid) ->
  URL = "https://example.com", % url demo 
  Type = "application/json",
  ReqArr = ["[{\"id\": \"", integer_to_list(Gid), "\"}]"],
  ReqBody = string:join(ReqArr, ""),

  case httpc:request(post, {URL, [], Type, ReqBody}, [], []) of
    {ok, {_, _, ResBody}} ->
      if
        length(ResBody) =:= 0 ->
          io:format("Y: ~p ~n", [Gid]),
          m ! {no_res, Gid};
        true ->
          io:format("N: ~p ~n", [Gid]),
          m ! {have_res, Gid}
      end;
    {error, Reason} ->
      io:format("error cause ~p~n", [Reason]);
    _ ->
      io:format("error cause ~n", [])
  end.

Now, when I run the code, the process terminates immediately. Log:

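I have two questions:

  1. How do I solve this problem?
  2. If I have tens of thousands of ids in L1, how do I handle that? Spawn dozens of actors? If so, how do you decide which actor receives which id?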

1) Instead of wrapping an anonymous function around loop():

register(m, spawn(fun() -> loop() end)),

you can call spawn/3:

register(m, spawn(?MODULE, loop, []) ),

Same here:

register(fetch, spawn(fun() -> x() end)),

Nope. Calling spawn/3 in an escript doesn't work unless you precompile the script with:

escript -c myscript.erl

2) An escript creates a process to execute the main/1 function you define. Your main/1 function looks like this:

main(_) ->
  init(),
  start().

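The init() function doesn't loop, so it ends as soon as all the functions it calls have returned, i.e. ssl:start(), inets:start(), and register(). Your start() function doesn't loop either, so after start() returns, main() returns, and because the process executing main() has nothing left to do, it ends. When main/1 returns, the escript exits, which also takes down the processes you registered as m and fetch.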
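To see the "a process ends when its function returns" point in isolation, here is a minimal sketch. The module name exit_demo and the function run/0 are made up for illustration; the code only uses spawn_monitor/1 to watch a process whose function returns right away.

```erlang
-module(exit_demo).
-export([run/0]).

%% Spawn a process whose function returns immediately, then report
%% the exit reason that the monitor delivers.
run() ->
    {Pid, Ref} = spawn_monitor(fun() -> ok end),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            Reason
    after 1000 ->
        %% Assumed safety net so the sketch can never hang.
        timeout
    end.
```

Calling exit_demo:run() in the shell returns normal: the spawned process had nothing left to do, so it exited normally. The process that runs main/1 behaves the same way.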

3)

How do I solve this problem?

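An HTTP POST request takes an eternity in terms of computer processing speed, and it involves waiting, so you can speed up your code by executing multiple POST requests at the same time rather than one after another. In Erlang, the way to execute things concurrently is to spawn additional processes. In your case, that means spawning a new process for each POST request.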
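As a rough illustration of why one process per request helps, the sketch below replaces the real POST with a hypothetical fake_post/1 that just sleeps for 200 ms. The module name concurrent_demo, fake_post/1, and the 200 ms delay are all assumptions made for the demo, not part of your code.

```erlang
-module(concurrent_demo).
-export([run/0]).

%% Hypothetical stand-in for http_post/1: sleeps to imitate network latency.
fake_post(Gid) ->
    timer:sleep(200),
    {have_res, Gid}.

%% Returns {ElapsedMilliseconds, Replies}.
run() ->
    Gids = [114689, 114688, 114691, 114690],
    Parent = self(),
    {Micros, Replies} = timer:tc(fun() ->
        %% Start one process per id; each sends its result back tagged with its own pid.
        Pids = [spawn(fun() -> Parent ! {self(), fake_post(Gid)} end) || Gid <- Gids],
        %% Collect the replies in spawn order by matching on each pid.
        [receive {Pid, Reply} -> Reply end || Pid <- Pids]
    end),
    {Micros div 1000, Replies}.
```

Because the four sleeps overlap, the elapsed time should be close to a single 200 ms delay rather than four of them back to back.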

Your main process can be an infinite loop that sits in a receive, waiting for messages, like this:

main(_) ->
  init(),
  loop().

where init() looks like this:

init() ->
  ssl:start(),
  inets:start(),
  register(m, self()),  % http_post/1 sends its results to the name m
  ok.

Then you can create a user interface function like start() to spawn the POST requests:

start() ->
  L1 = [114689,114688,114691,114690], % detail page id

  lists:foreach(fun(Gid) ->
    io:format("~p ~n", [Gid]),
    spawn(?MODULE, http_post, [Gid])
  end, L1).
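A loop that never stops also means the script never finishes. One possible variation, sketched here under my own assumptions rather than taken from the code above, is to count the expected replies and return once they have all arrived. The module name collect_demo, the function collect/1, and the 5-second timeout are hypothetical; the workers in run/0 send canned {have_res, Gid} messages instead of doing a real POST.

```erlang
-module(collect_demo).
-export([run/0, collect/1]).

%% Wait for exactly N result messages, then return instead of looping forever.
collect(0) ->
    done;
collect(N) when N > 0 ->
    receive
        {no_res, Gid} ->
            io:format("~p no res!~n", [Gid]),
            collect(N - 1);
        {have_res, Gid} ->
            io:format("~p have res!~n", [Gid]),
            collect(N - 1)
    after 5000 ->
        %% Assumed limit: give up if no message arrives within 5 seconds.
        timeout
    end.

%% Demo driver: each worker sends a canned reply to the collecting process.
run() ->
    Parent = self(),
    Gids = [114689, 114688, 114691, 114690],
    lists:foreach(fun(Gid) ->
        spawn(fun() -> Parent ! {have_res, Gid} end)
    end, Gids),
    collect(length(Gids)).
```

collect_demo:run() prints one line per id and returns done, at which point main/1 could return and let the script exit.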

---Response to comment---

Here is an example:

%First line cannot have erlang code.
main(_) ->
    init(),
    start().

init() ->
    ssl:start(),
    inets:start().

start() ->
    L1 = [1, 2, 3, 4],
    Self = self(),

    Pids = lists:map(fun(Gid) ->
        Pid = spawn(fun() -> http_post(Gid, Self) end),
        io:format("Spawned process with Gid=~w, Pid=~w~n", [Gid, Pid]),
        Pid
    end, L1),

    io:format("Pids = ~w~n", [Pids]),

    lists:foreach(
        fun(Pid) ->
            receive
                {no_res, {Gid, Pid} } ->
                    io:format("no response! (Gid=~w, Pid=~w)~n", [Gid, Pid]);
                {have_res, {Gid, Pid, Reply}} ->
                    io:format("got response: ~p~n(Gid=~w, Pid=~w)~n~n", 
                              [Reply, Gid, Pid]);
                {Gid, Pid, Error} ->  % same element order as the tuple sent by http_post/2
                    io:format("Error:~n(Gid=~w, Pid=~w)~n~p~n", [Gid, Pid, Error])
            end
        end, Pids).

http_post(Gid, Pid) ->
  URL = "http://localhost:8000/cgi-bin/read_json.py", % url demo 
  Type = "application/json",
  ReqArr = ["[{\"id\": \"", integer_to_list(Gid), "\"}]"],
  ReqBody = string:join(ReqArr, ""),

  case httpc:request(post, {URL, [], Type, ReqBody}, [], []) of
    {ok, {_, _, ResBody}} ->
      if
        length(ResBody) =:= 0 ->
          io:format("Y: ~p ~n", [Gid]),
          Reply = {no_res, {Gid, self()} },
          Pid ! Reply;
        true ->
          io:format("N: ~p ~n", [Gid]),
          Reply = {have_res, {Gid, self(), ResBody} },
          Pid ! Reply
      end;
    {error, _Reason}=Error ->
        Pid ! {Gid, self(), Error};
    Other ->
        Pid ! {Gid, self(), Other} 
  end.

If I have tens of thousands of ids in L1, how do I handle that?

Same way. A hundred thousand processes is not considered a lot of processes in Erlang.
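If you want to convince yourself, a quick sketch like the following spawns one process per id and collects a trivial result from each. The module name many_procs_demo and the doubling "work" are placeholders I made up; erlang:system_info(process_limit) reports the maximum number of processes the running VM is configured to allow.

```erlang
-module(many_procs_demo).
-export([run/1]).

%% Spawn N short-lived processes, one per id, and collect one reply from each.
%% Returns {NumberOfReplies, ProcessLimitOfThisVM}.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), Gid * 2} end)
            || Gid <- lists:seq(1, N)],
    %% Match on each pid so every worker is accounted for exactly once.
    Results = [receive {Pid, R} -> R end || Pid <- Pids],
    {length(Results), erlang:system_info(process_limit)}.
```

For example, many_procs_demo:run(10000) returns a tuple whose first element is 10000, and whose second element shows how much headroom the VM still has. No routing decision is needed: each process is created for exactly one id.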