豪翔天下

Change My World by Program



MySQL Design Methods

Posted on 2015-12-10 | Filed under 编程之路 |

On Architecture

  • In general, serve read requests from a cache (Redis) and send update requests to the database

On Indexes

  • A single-table query can use only one index; in a join, however, each table can still use its own index
  • Sometimes an index will not be used at all, for example:
    where key like 'keyword%': the index on key can be used
    where key like '%keyword%': the index on key cannot be used

  • Build composite indexes where appropriate:
    A few days ago I read《如何应对并发(1) - 关于数据索引》from the "caoz的梦呓" column, which explains what to weigh when building a composite index; column order makes a large difference in efficiency.
    SELECT * FROM user WHERE area = '$area' ORDER BY lastlogin DESC LIMIT 30;
    With an index on area alone, the database pulls out every row matching that area and then sorts them all by lastlogin;
    With an index on lastlogin alone, it walks backwards from the last entry, checking area on every row, until it has counted out 30;
    A composite index on lastlogin+area behaves the same as the lastlogin index alone;
    A composite index on area+lastlogin keeps the two columns concatenated and sorted, so the 30 rows this query needs sit completely contiguously: scanning just 30 index entries is enough.
    I had assumed a composite index would still look up area first and then sort the results, but an index is pre-sorted: entries are ordered by area, and entries with equal area are ordered by lastlogin. So the engine finds the area and takes the first 30 entries, like a phone book: sorted by surname first, and identical surnames sorted by given name.
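The phone-book behaviour above can be checked directly. Here is a minimal sketch using SQLite (bundled with Python) as a stand-in for MySQL; the table and index names are made up for the demo. The query plan hits the composite index and needs no separate sort step:

```python
import sqlite3

# A composite index on (area, lastlogin), mirroring the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, area TEXT, lastlogin INTEGER)")
conn.execute("CREATE INDEX idx_area_lastlogin ON user (area, lastlogin)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM user WHERE area = ? ORDER BY lastlogin DESC LIMIT 30",
    ("cn",),
).fetchall()
# The plan searches idx_area_lastlogin and shows no 'USE TEMP B-TREE FOR
# ORDER BY' step: rows with equal area are already stored in lastlogin
# order inside the index, so no extra sort is needed.
print(plan)
```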


Playing with the Raspberry Pi 2

Posted on 2015-12-08 | Filed under 就是爱玩 |

I wanted to run NAS, DNS, and other private-cloud services at home, but the retired PCs I had were no longer up to the task, so the only option left was to try a Raspberry Pi. It turned out to be a genuine palm-sized computer; the desktop build can reportedly even run word processors. It has only 1 GB of RAM, but at a bit over 200 RMB (from a Taobao shop) it is excellent value. Of course, only a fellow tinkerer can appreciate this kind of fun. If its power draw were lower, or it could take other power sources (wireless charging, batteries), it could upend the smart-device market entirely.

Building the Boot Image

Download an image from https://www.raspberrypi.org/downloads/. I chose the RASPBIAN branch because it is the official build and, being Debian-based, works just like Ubuntu.
On macOS:

$ df # list mounted volumes; the SD card is usually last: Filesystem /dev/disk2s1, mounted on /Volumes/No Name (you can rename the card to Pi in Finder; mine defaulted to No Name)
$ diskutil unmount /dev/disk2s1 # unmount the SD card
Volume Pi on disk2s1 unmounted
$ diskutil list # confirm the SD card device shows up
$ dd bs=4m if=pi.img of=/dev/rdisk2 # write pi.img to the card; note this targets rdisk2, the raw character device
$ diskutil unmountDisk /dev/disk2 # unmount once more; the card can now go into the Pi's SD slot

Booting the OS

The day the board arrived I noticed a DC-looking jack and assumed it was powered through it, but a walk around the neighbourhood found no such adapter for sale. Back home, on closer inspection, it turned out an Android phone charger powers it (I forget the connector's name). As commonly recommended online, I used a 5V 2A supply (in fact just plugged into a Mi power strip).
Then I discovered I had no spare Ethernet cable at home; since I had not installed the desktop build, no cable meant no SSH. Fortunately it has HDMI, so I hooked it up to the 40-inch TV at home (HDMI output is damn sharp), like that, with an external keyboard plugged in over USB.

It boots automatically on power-up, so plugging it in drops you into the system. The default username is pi, the default password raspberry. Then do some basic setup by running the configuration tool with sudo raspi-config:

  • First option: expand the filesystem so all of the SD card's remaining space is usable
  • Then, under Internationalisation Options, set the timezone and the default character encodings (zh_CN GB2312 / zh_CN.UTF-8 UTF-8)
  • Next, switch the apt sources; in this country there is no way around it
$ sudo nano /etc/apt/sources.list.d/raspi.list
change it to:
deb http://mirrors.ustc.edu.cn/archive.raspberrypi.org/debian/ jessie main
$ sudo nano /etc/apt/sources.list
change it to:
deb http://mirrors.ustc.edu.cn/raspbian/raspbian/ jessie main non-free contrib
deb-src http://mirrors.ustc.edu.cn/raspbian/raspbian/ jessie main non-free contrib
  • Finally, install the necessary software

    sudo apt-get update && sudo apt-get upgrade
    sudo apt-get install vim tree ttf-wqy-microhei git
    # install Python 3 as described in python.md
  • WiFi setup

    Of course I cannot keep using the TV as a monitor forever, so the wireless dongle I had bought came into play: plug it into a USB port, then configure WiFi
    $ ifconfig # wlan0 showing up means the wireless NIC is recognized
    $ sudo vim /etc/network/interfaces # add or edit the wlan0 stanza
    auto wlan0
    allow-hotplug wlan0
    iface wlan0 inet dhcp
    wpa-ssid your-wifi-name
    wpa-psk your-wifi-password
    # then restart the interface with
    sudo ifdown wlan0 && sudo ifup wlan0
  • Building an ownCloud private cloud

    As the private-cloud solution I chose ownCloud rather than Samba: Samba amounts to FTP-style file sharing, not a private cloud. ownCloud has its own much-criticized faults, notably memory usage (100+ MB on the Raspberry Pi 2), and it is built around Apache. The Pi has only 1 GB of RAM in total and I did not want Apache and Nginx side by side, so I went straight to Nginx + php5-fpm, which makes the configuration somewhat fiddly.
    # First, install the base services
    sudo apt-get install php5-common php5-cli php5-fpm
    sudo apt-get install nginx
    sudo apt-get install mysql-server mysql-client
    # Configure MySQL: ownCloud needs the user, database, and grants created in advance
    > create database dbname character set utf8 collate utf8_general_ci;
    > grant ALL on dbname.* to 'username'@'localhost' identified by 'password'; # note: ownCloud refuses the root user, it has too many privileges
    # File permissions
    chmod -R 775 owncloud/ # do not use 777; it will not work even if you do
    chown -R www-data:www-data owncloud/
    # Configure php5-fpm
    $ printenv PATH # get the system PATH
    vim /etc/php5/fpm/pool.d/www.conf and remove the leading ; from the lines below
    ;env[HOSTNAME] = $HOSTNAME
    ;env[PATH] = /usr/local/bin:/usr/bin:/bin # also change this to the PATH obtained above
    ;env[TMP] = /tmp
    ;env[TMPDIR] = /tmp
    ;env[TEMP] = /tmp
    # Configure nginx following the official guide: https://doc.owncloud.org/server/7.0/admin_manual/installation/nginx_configuration.html
    I changed the official configuration in the following places:
    replace the location ~ .php(?:$|/) block with:
    location ~ ^(.+?\.php)(/.*)?$ {
      fastcgi_split_path_info ^(.+\.php)(/.+)$;
      fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
      fastcgi_param PATH_INFO $fastcgi_path_info;
      fastcgi_pass unix:/var/run/php5-fpm.sock;
      fastcgi_index index.php;
      include fastcgi_params;
      fastcgi_param PHP_VALUE "post_max_size=10G \n upload_max_filesize=10G"; # the default upload limit is a mere 513MB; raise it here, or it cannot be raised inside ownCloud
    }
    Check whether the configuration is valid with: nginx -t

TroubleShooting

- **Chinese locale setup**:
sudo raspi-config
Deselect en_GB.UTF-8 UTF-8
Select en_US.UTF-8 UTF-8, zh_CN.UTF-8 UTF-8, and zh_CN.GBK GBK
Then, on the second screen, choose en_GB.UTF-8 UTF-8 as the default language
References:
http://blog.akarin.xyz/raspberry-init/
https://github.com/ccforward/cc/issues/25

Use Cases for Various Databases

Posted on 2015-12-07 | Filed under 编程之路 |
  • Redis

Counters: behavioural metrics, click counts, page views, leaderboards, latest-N or top-N data, and so on
Caching: session cache, page cache, global-variable cache
Queues: queue service
Expiry: data that needs a TTL

  • MongoDB

Documents: logs, articles, and other text


NAT Traversal Options

Posted on 2015-12-04 | Filed under 编程之路 |

Background: the Great LAN of China. People online say you can phone support and ask to be switched to a public IP, but with both Telecom and Mobile broadband, nobody from support staff down to the installer even knew what a public IP was; they assumed asking for one meant I wanted a dedicated line. A public IP and shared bandwidth are clearly two different things: block port 80 if you must, but at least leave me one other port. Support being useless, I had to do it myself.

Option 1: SSH Tunnel

Port forwarding over an SSH tunnel is the simplest traversal option when you do not need desktop access; the only requirement is a proxy server with a public IP.

  1. First, on the internal host, run the ssh command:
ssh -NfR remote_ssh_port:localhost:local_ssh_port remote_ip
  2. On the proxy server, connect back into the internal host over ssh:
ssh -p the_port_defined_above localhost
  3. SSH drops far too easily; there are several ways to keep it alive:
    # First, adjust the SSH configuration
On the proxy server: vim /etc/ssh/sshd_config, add or edit these two lines


ClientAliveInterval 60
ClientAliveCountMax 10
Then restart the SSH service: service sshd restart

On the internal host: vim /etc/ssh/ssh_config, add or edit


Host *
ServerAliveInterval 30

# In practice both of these still drop the line, so here is the last-resort fix



Write a Python script and run it with nohup keepalived.py &; inside it, open a new SSH connection and keep sending a space.

```
#!/usr/bin/python3
# coding: utf-8
import paramiko
import time

class MySSH:
    def __init__(self, ip, port, username, password):
        self.ssh = paramiko.SSHClient()
        self.ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.ssh.connect(ip, port, username, password, timeout=5)

    def exec(self, cmd):
        return self.ssh.exec_command(cmd)

    def close(self):
        self.ssh.close()

if __name__ == '__main__':
    ssh = MySSH('localhost', 8022, 'haofly', '896499825')
    try:
        while True:
            ssh.exec(' ')  # send a single space as a keep-alive
            time.sleep(30)
    finally:
        ssh.close()
```

Recommended reading:
A roundup of NAT traversal, intranet website exposure, and intranet-to-public port-mapping services


Another Blog Redesign

Posted on 2015-12-02 | Filed under 编程之路 |

After five drawn-out months (with precious little actual coding time), I have finally built my own static blog. It is not as convenient as WordPress, not as pretty as I had hoped, and plenty of features are still unfinished, but it is the first time I have used my own skills to make something "usable" for myself.

Two-plus years into blogging, from Octopress on GitHub to WordPress to this hand-built static blog, I have written a hundred-odd articles. Since this redesign was a full rebuild, I migrated them over one by one; some formatting still needs fixing, but the overall shape is there. The blog now has several features I really like:

1. Fully static: Nginx serves the HTML files directly

2. Comments are submitted asynchronously via Ajax; a comment does not immediately trigger the endpoint that regenerates the HTML, which only updates after moderation

3. SEO done by hand; plenty left to learn here

4. Each article gets one large photo. I chose this theme after falling in love with photography, because it makes me put care not just into writing each article but into taking each photo (and when I truly have no photo, I use free, commercially usable images from Pixabay)

5. Written in my favourite language, Python (the Django framework)

In short, the blog saw few updates these past months because work piled up and I managed my time poorly. The last few months of interning have actually taught me a great deal, and I will gradually write it up in future articles. Who knows whether anyone will read them; as long as it makes me happy.


Python 3: Using MySQL Connector to Work with Databases

Posted on 2015-11-04 | Filed under 编程之路 |

Reference documentation:
https://dev.mysql.com/doc/connector-python/en/

http://mysql-python.sourceforge.net/MySQLdb.html

Installation

The MySQL drivers that support Python 3 are mysqlclient and pymysql; MySQLdb itself only supports Python 2 and is not recommended.

# ubuntu
sudo apt-get install python3-dev libmysqlclient-dev
pip install mysqlclient
# CentOS
sudo yum install python3-devel mysql-devel
pip install mysqlclient

Connecting to the Database

The official documentation lists all available connection parameters.

# Connecting with Oracle's official driver
import mysql.connector
cnx = mysql.connector.connect(
    user='',
    password='',
    host='',
    database='',
    pool_size=3)  # connection-pool size
cnx.close()
# Connecting with a MySQLdb-style driver such as mysqlclient
import MySQLdb
db = MySQLdb.connect(
    host='',
    user='',
    passwd='',
    db='',
    charset='utf8',
    autocommit=True)
cursor = db.cursor()

Differences Between the Two Libraries

# MySQL Connector/Python
Oracle's official implementation, written in pure Python (an optional C extension exists)
Cursors are unbuffered by default; enable buffering explicitly with cnx.cursor(buffered=True) or mysql.connector.connect(buffered=True). With buffering enabled, several cursors can be used at the same time

# MySQLdb (mysqlclient)
A C wrapper around libmysqlclient
Cursors are buffered by default, caching the whole result set; on very large queries this can crash the program

CRUD Operations

Insert

# Insert a single row
import datetime
insert_stmt = (
  "INSERT INTO employees (emp_no, first_name, last_name, hire_date) "
  "VALUES (%s, %s, %s, %s)"
)
data = (2, 'Jane', 'Doe', datetime.date(2012, 3, 23))
cursor.execute(insert_stmt, data)

# Insert several rows at once
data = [
    ('a', 'b'),
    ('e', 'f')
]
stmt = 'INSERT INTO table_name (field_name1, field_name2) VALUES (%s, %s)'
cursor.executemany(stmt, data)

Read

# Reading the source shows that fetchone/fetchmany/fetchall surprisingly share one implementation (https://github.com/PyMySQL/mysqlclient-python/blob/7d289b21728ab1a94bb1f0210a26367c6714d881/MySQLdb/cursors.py): the whole result set is fetched and stored in one go, and the three methods merely slice that list
# fetchone
cursor.execute('SELECT * FROM user')
row = cursor.fetchone()
while row is not None:
    print(row)
    row = cursor.fetchone()
# the cursor itself can be used as an iterator
cursor.execute(sql)
for row in cursor:
    print(row)
# fetchmany(): fetch a fixed number of rows; each fetch advances the cursor's offset
rows = cursor.fetchmany(size=1)
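To illustrate the cursor-as-iterator point concretely, here is a sketch with sqlite3 (also a DB-API driver, bundled with Python) standing in for MySQL; the table and rows are invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE user (name TEXT);"
    "INSERT INTO user VALUES ('a'), ('b'), ('c');"
)
cur = conn.execute("SELECT name FROM user ORDER BY name")
first = cur.fetchone()           # one row as a tuple: ('a',)
rest = [row[0] for row in cur]   # iteration resumes from the cursor's offset
print(first, rest)
```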

TroubleShooting

  • Getting the ID after an insert

    db.insert_id() # the ID generated by the previous insert
  • Getting the raw SQL that was executed

    print(cursor._last_executed)
  • Under multithreading, the error "OperationalError: (2013, 'Lost connection to MySQL server during query')" appears when threads share a single MySQL connection: the connection gets torn down once one thread is finished with it. Guard access with a lock; note also that reassigning a shared global inside a thread will invalidate the other threads' references to it

    LOCK.acquire()
    mysql.cursor.execute(sql)
    result = mysql.cursor.fetchall()
    LOCK.release()
    print(len(result))
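The locking pattern above can be sketched as a complete runnable example; sqlite3 stands in for MySQL here so it can be tried without a server, and the table name is invented for the demo:

```python
import sqlite3
import threading

lock = threading.Lock()
# One shared connection for all threads (the situation described above).
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE t (n INTEGER)")

def worker(n):
    with lock:  # only one thread touches the shared connection at a time
        conn.execute("INSERT INTO t VALUES (?)", (n,))
        conn.commit()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 10
```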
  • Can't connect to local MySQL server through socket '/tmp/mysql.sock'

    Likely because MySQL was compiled from source into a non-default location, so Python cannot find the socket file at the default path; create a symlink from the real socket file to that default location.


Managing Processes with Supervisor

Posted on 2015-08-11 | Filed under 编程之路 |

Reference: http://segmentfault.com/a/1190000002991175 (the original also installs OneAPM's Python agent to monitor web-app metrics in real time; I have not tried that part yet)

Supervisor is a process manager written in Python. In practice it is handy for starting a batch of related processes together, whether Django's runserver or Nginx, Apache and the like. Here is how to use it:

Installation

# ubuntu
apt-get install supervisor
service supervisor restart

# centos
yum install supervisor
/etc/init.d/supervisord restart


# If installation fails with an error like unix:///var/run/supervisor.sock no such file, see: http://tuzii.me/diary/522dc528848eea683d7724f2/%E8%A7%A3%E5%86%B3ubuntu-supervisor-unix:var-run-supervisor.sock-no-such-file.%E7%9A%84%E6%96%B9%E6%B3%95




sudo easy_install supervisor
echo_supervisord_conf > supervisord.conf  # generate a config file
sudo supervisord -c supervisord.conf      # start supervisord with that config
sudo supervisorctl                        # enter the interactive management shell

Defining a Program

# Add the following to supervisord.conf
[program:frontend]                                           # program name
command=/usr/bin/python manage.py runserver 0.0.0.0:8000     # command that starts the process
directory=/media/sf_company/frontend/frontend                # directory to switch to before running the command
startsecs=0
stopwaitsecs=0
autostart=false
autorestart=false
user=root
stdout_logfile=/root/log/8000_access.log                     # access log
stderr_logfile=/root/log/8000_error.log                      # error log

This defines a program named frontend.

Common supervisorctl commands:

start name   # start a program
stop name    # stop a program
status       # show the current state of managed programs

Converting SSL Certificates Between Formats (JKS to PEM, KEY, CRT)

Posted on 2015-08-04 | Filed under 编程之路 |

Original: http://ju.outofmemory.cn/entry/108566

Fuck, I hate Java. A colleague claimed that JKS is a Java-only thing, so you must call into Java to use it. But I use Python, and nothing is impossible: it turns out requests can use certificates in other formats directly. Python's pyjks package could also convert the JKS to other formats, but there is no need, since converting once with the standard ssl tooling settles it for good.

JKS (Java KeyStore) is Java's certificate store, holding CA certificates, public-key certificates and the like. The JDK ships a tool, keytool, for managing keystores. Conversion steps:

  1. Export to PKCS12 format using keytool:

    keytool -importkeystore -srckeystore server.jks -destkeystore server.p12 -srcstoretype jks -deststoretype pkcs12

    Enter destination keystore password:

    Re-enter new password:
    Enter source keystore password:

    Entry for alias ca_root successfully imported.
    Import command completed: 1 entries successfully imported, 0 entries failed or cancelled

  2. Generate a PEM certificate (containing the key, the server certificate and the CA certificate):

    # PEM with an encrypted key

    $ openssl pkcs12 -in server.p12 -out server.pem
    Enter Import Password:
    MAC verified OK
    Enter PEM pass phrase:
    Verifying - Enter PEM pass phrase:

# PEM with an unencrypted key




$ openssl pkcs12 -nodes -in server.p12 -out server.pem
Enter Import Password:
MAC verified OK
  3. Export the key on its own:

    # encrypted key

    $ openssl pkcs12 -in tankywoo.p12 -nocerts -out server.key
    Enter Import Password:
    MAC verified OK
    Enter PEM pass phrase:
    Verifying - Enter PEM pass phrase:

# unencrypted key




$ openssl pkcs12 -in tankywoo.p12 -nocerts -nodes -out server.key
Enter Import Password:
MAC verified OK
  4. Export the server certificate on its own:

    $ openssl pkcs12 -in server.p12 -nokeys -clcerts -out server.crt
    Enter Import Password:
    MAC verified OK

  5. Export the CA certificate on its own:

    $ openssl pkcs12 -in server.p12 -nokeys -cacerts -out ca.crt
    Enter Import Password:
    MAC verified OK

TroubleShooting:

1. As for the error the original article hits when importing the ca_root entry: its workaround did not seem to work; upgrading to Java 8 makes the import succeed.

2. When using SSL from Python (whether via httplib, ssl or requests), you may see the following error:

Traceback (most recent call last):
  File "client.py", line 10, in <module>
    ssl_sock.connect(('', 9000))
  File "/Users/amk/source/p/python/Lib/ssl.py", line 204, in connect
    self.ca_certs)
ssl.SSLError: [Errno 0] _ssl.c:327: error:00000000:lib(0):func(0):reason(0)

The root cause is simply that the certificate supplied is wrong.


[Repost] Functional Programming (CoolShell)

Posted on 2015-07-25 | Filed under 韦编三绝 |

Original: http://coolshell.cn/articles/10822.html

This piece was written at the end of 2013, and to me it is still the cream of the crop. I love this kind of deeply insightful yet approachable writing: it explains, in several languages and from several angles, what functional programming really is. Together with the many problems at work recently, it made me realize how much of the company's earlier code was actually excellent.


When functional programming comes up, this is what it typically looks like:

  • The three defining traits of functional programming:
    immutable data: as in Clojure, variables are immutable by default; to change one you copy it out and modify the copy. This spares your program many bugs, because program state is hard to maintain, and harder still under concurrency. (Imagine a program with complex state: when someone else later changes your code, bugs come easily, and in parallel code such problems multiply.)
    first class functions: this lets you use functions like variables: a function can be created, modified, passed around like a variable, returned, or nested inside another function. Somewhat like JavaScript's prototypes (see JavaScript object-oriented programming).
    tail recursion optimization: we know recursion's downside: if it goes deep, the stack suffers and performance drops sharply. So we use tail-call optimization, reusing the stack frame on each recursive call to improve performance; this needs support from the language or compiler, and Python does not have it.

  • A few functional techniques
    map & reduce: this needs little introduction: the most common functional technique is mapping and reducing over a collection, which reads far better than procedural code (a procedural language needs for/while loops that shuffle data back and forth between variables). Much like the foreach, find_if, count_if family in C++'s STL.
    pipeline: wrap functions as individual actions, put a set of actions into an array or list, and feed data into it; the data is operated on by each function in sequence, like a pipeline, yielding the result we want.
    recursing: recursion's greatest virtue is simplifying code; it can describe a complex problem with very simple code. Note: the essence of recursion is describing the problem, and that is precisely the essence of functional programming.
    currying: split a function of several parameters into several functions, then wrap them in layers, each layer returning a function that accepts the next parameter. This simplifies multi-parameter functions; in C++ it resembles STL's bind1st or bind2nd.
    higher order function: a higher-order function takes a function as an argument, wraps the incoming function, and returns the wrapper. On the surface functions get passed in and out everywhere, much as objects fly around in OOP.

  • And some benefits of the functional style
    parallelization: in a parallel environment, the threads need no synchronization or mutual exclusion. lazy evaluation: this needs compiler support. An expression is not evaluated the moment it is bound to a variable, but when the value is actually used: a statement like x := expression (assigning an expression's result to a variable) plainly calls for the expression to be computed into x, but nothing is computed until a later expression's reference to x creates demand for its value; that later expression's own evaluation can be deferred as well, until this fast-growing dependency tree is finally computed to produce some symbol the outside world sees. determinism: like mathematics, f(x) = y gives the same result in every scenario; we call this the function's determinism. Many functions in ordinary programs are not like this: given the same argument they compute different results in different scenarios, where "different scenarios" means the function varies with some runtime state.

All of that is too abstract, so let us work through some examples step by step.

We start with the simplest possible illustration of what functional programming is.

First, a non-functional example:

int cnt;
void increment(){
    cnt++;
}

So what does the functional version look like?

int increment(int cnt){
    return cnt+1;
}

You may find this example too mundane. Yet it is exactly the functional credo: do not depend on external data, do not modify external data, return a new value instead.

Another simple example:

def inc(x):
    def incx(y):
        return x+y
    return incx

inc2 = inc(2)
inc5 = inc(5)

print inc2(5) # prints 7
print inc5(5) # prints 10

In the example above, inc() returns another function, incx(), so inc() can be used to construct arbitrary versions of an inc function, such as inc2() and inc5(). This is the currying technique described earlier. From it you can sense the functional mindset: use functions like variables, and focus on describing the problem rather than on how to implement it, which makes code easier to read.

Map & Reduce

In functional programming we should not iterate with explicit loops; we should use higher-level constructs instead, as in this Python code:

name_len = map(len, ["hao", "chen", "coolshell"])
print name_len
# prints [3, 4, 9]

Code like this is easy to read because it describes what to do, not how to do it.

Another Python example:

def toUpper(item):
    return item.upper()

upper_name = map(toUpper, ["hao", "chen", "coolshell"])
print upper_name
# prints ['HAO', 'CHEN', 'COOLSHELL']

Incidentally, is that not rather like STL's transform?

#include <string>
#include <algorithm>
#include <iostream>
using namespace std;
int main() {
    string s="hello";
    string out;
    transform(s.begin(), s.end(), back_inserter(out), ::toupper);
    cout << out << endl;
    // prints: HELLO
}

In the Python example above, we defined a function, toUpper, that does not modify the value passed in; it just performs a simple operation on it and returns the result. Plugged into map, it describes clearly what we want, instead of forcing the reader to decipher how a loop is implemented and discover its meaning only after wading through all the loop logic. Below is how the implementation-describing, procedural style plays it (doesn't it look less clear than the functional version?):

upname = ['HAO', 'CHEN', 'COOLSHELL']
lowname = []
for i in range(len(upname)):
    lowname.append( upname[i].lower() )

With map, do not forget lambda expressions: you can simply think of a lambda as an inline anonymous function. The lambda below is equivalent to: def func(x): return x*x

squares = map(lambda x: x * x, range(9))
print squares
# prints [0, 1, 4, 9, 16, 25, 36, 49, 64]

Now, how does reduce play? (The lambda below takes two parameters, i.e. each step takes two values from the list, computes them, and puts the result back; the expression below is equivalent to ((((1+2)+3)+4)+5).)

print reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
# prints 15

Besides map and reduce, Python has helper functions such as filter, find, all, any (other functional languages have them too) that make your code more concise and readable. Now a more involved example:

Compute the average of the positive numbers in an array

num = [2, -5, 9, 7, -2, 5, 3, 1, 0, -3, 8]
positive_num_cnt = 0
positive_num_sum = 0
for i in range(len(num)):
    if num[i] > 0:
        positive_num_cnt += 1
        positive_num_sum += num[i]

if positive_num_cnt > 0:
    average = positive_num_sum / positive_num_cnt

print average
# prints 5

Written functionally, the example becomes:

positive_num = filter(lambda x: x>0, num)
average = reduce(lambda x,y: x+y, positive_num) / len( positive_num )
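One caveat for readers on Python 3 (the article's snippets are Python 2): map and filter return lazy iterators there, and reduce has moved into functools, so the same example needs small adjustments:

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

num = [2, -5, 9, 7, -2, 5, 3, 1, 0, -3, 8]
# filter() yields an iterator in Python 3; materialize it so it can be
# both reduced and measured with len().
positive_num = list(filter(lambda x: x > 0, num))
average = reduce(lambda x, y: x + y, positive_num) / len(positive_num)
print(average)  # 5.0 (/ is true division in Python 3)
```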

The C++11 way:

#include <iostream>
#include <algorithm>
#include <vector>
#include <numeric>
#include <iterator>
using namespace std;
vector<int> num {2, -5, 9, 7, -2, 5, 3, 1, 0, -3, 8};
vector<int> p_num;
copy_if(num.begin(), num.end(


[转]Apache vs Nginx: Practical Considerations

Posted on 2015-07-18 | Filed under 编程之路 |

Original: https://www.digitalocean.com/community/tutorials/apache-vs-nginx-practical-considerations

In short, each has its strengths; the best arrangement is Nginx in front as a reverse proxy, serving static content along the way, with Apache behind it handling dynamic content.

Introduction

Apache and Nginx are the two most common open source web servers in the world.
Together, they are responsible for serving over 50% of traffic on the
internet. Both solutions are capable of handling diverse workloads and working
with other software to provide a complete web stack.

While Apache and Nginx share many qualities, they should not be thought of as
entirely interchangeable. Each excels in its own way and it is important to
understand the situations where you may need to reevaluate your web server of
choice. This article will be devoted to a discussion of how each server stacks
up in various areas.

General Overview

Before we dive into the differences between Apache and Nginx, let’s take a
quick look at the background of these two projects and their general
characteristics.

Apache

The Apache HTTP Server was created by Robert McCool in 1995 and has been
developed under the direction of the Apache Software Foundation since 1999.
Since the HTTP web server is the foundation’s original project and is by far
their most popular piece of software, it is often referred to simply as
“Apache”.

The Apache web server has been the most popular server on the internet since
1996. Because of this popularity, Apache benefits from great documentation and
integrated support from other software projects.

Apache is often chosen by administrators for its flexibility, power, and
widespread support. It is extensible through a dynamically loadable module
system and can process a large number of interpreted languages without
connecting out to separate software.

Nginx

In 2002, Igor Sysoev began work on Nginx as an answer to the C10K problem,
which was a challenge for web servers to begin handling ten thousand
concurrent connections as a requirement for the modern web. The initial public
release was made in 2004, meeting this goal by relying on an asynchronous,
event-driven architecture.

Nginx has grown in popularity since its release due to its light-weight
resource utilization and its ability to scale easily on minimal hardware.
Nginx excels at serving static content quickly and is designed to pass
dynamic requests off to other software that is better suited for those
purposes.

Nginx is often selected by administrators for its resource efficiency and
responsiveness under load. Advocates welcome Nginx’s focus on core web server
and proxy features.

Connection Handling Architecture

One big difference between Apache and Nginx is the actual way that they handle
connections and traffic. This provides perhaps the most significant difference
in the way that they respond to different traffic conditions.

Apache

Apache provides a variety of multi-processing modules (Apache calls these
MPMs) that dictate how client requests are handled. Basically, this allows
administrators to swap out its connection handling architecture easily. These
are:

  • mpm_prefork: This processing module spawns processes with a single thread each to handle requests. Each child can handle a single connection at a time. As long as the number of requests is fewer than the number of processes, this MPM is very fast. However, performance degrades quickly after the requests surpass the number of processes, so this is not a good choice in many scenarios. Each process has a significant impact on RAM consumption, so this MPM is difficult to scale effectively. This may still be a good choice though if used in conjunction with other components that are not built with threads in mind. For instance, PHP is not thread-safe, so this MPM is recommended as the only safe way of working with mod_php, the Apache module for processing these files.
  • mpm_worker: This module spawns processes that can each manage multiple threads. Each of these threads can handle a single connection. Threads are much more efficient than processes, which means that this MPM scales better than the prefork MPM. Since there are more threads than processes, this also means that new connections can immediately take a free thread instead of having to wait for a free process.
  • mpm_event: This module is similar to the worker module in most situations, but is optimized to handle keep-alive connections. When using the worker MPM, a connection will hold a thread regardless of whether a request is actively being made for as long as the connection is kept alive. The event MPM handles keep-alive connections by setting aside dedicated threads for them and passing active requests off to other threads. This keeps the module from getting bogged down by keep-alive requests, allowing for faster execution. This was marked stable with the release of Apache 2.4.

As you can see, Apache provides a flexible architecture for choosing different connection and request handling algorithms. The choices provided are mainly a function of the server's evolution and the increasing need for concurrency as the internet landscape has changed.

Nginx

Nginx came onto the scene after Apache, with more awareness of the concurrency
problems that would face sites at scale. Leveraging this knowledge, Nginx was
designed from the ground up to use an asynchronous, non-blocking, event-driven
connection handling algorithm.

Nginx spawns worker processes, each of which can handle thousands of
connections. The worker processes accomplish this by implementing a fast
looping mechanism that continuously checks for and processes events.
Decoupling actual work from connections allows each worker to concern itself
with a connection only when a new event has been triggered.

Each of the connections handled by the worker are placed within the event loop
where they exist with other connections. Within the loop, events are processed
asynchronously, allowing work to be handled in a non-blocking manner. When the
connection closes, it is removed from the loop.
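The single-threaded, event-driven loop described above can be sketched in miniature with Python's asyncio: one thread multiplexes many connections, touching each only when its events fire. The echo-style handler here is invented for the demo and the OS picks the port; this is an illustration of the model, not Nginx's actual implementation.

```python
import asyncio

async def handle(reader, writer):
    # Non-blocking: while this await is pending, the same thread's event
    # loop services every other connection.
    data = await reader.read()
    writer.write(data.upper())
    await writer.drain()
    writer.close()

async def main():
    # Port 0 lets the OS choose a free port, like a tiny one-worker server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"ping")
        writer.write_eof()           # signal EOF so the handler's read() returns
        reply = await reader.read()  # read until the handler closes the connection
        writer.close()
        return reply

reply = asyncio.run(main())
print(reply)  # b'PING'
```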

This style of connection processing allows Nginx to scale incredibly far with
limited resources. Since the server is single-threaded and processes are not
spawned to handle each new connection, the memory and CPU usage tends to stay
relatively consistent, even at times of heavy load.

Static vs Dynamic Content

In terms of real world use-cases, one of the most common comparisons between
Apache and Nginx is the way in which each server handles requests for static
and dynamic content.

Apache

Apache servers can handle static content using its conventional file-based
methods. The performance of these operations is mainly a function of the MPM
methods described above.

Apache can also process dynamic content by embedding a processor of the
language in question into each of its worker instances. This allows it to
execute dynamic content within the web server itself without having to rely on
external components. These dynamic processors can be enabled through the use
of dynamically loadable modules.

Apache’s ability to handle dynamic content internally means that configuration
of dynamic processing tends to be simpler. Communication does not need to be
coordinated with an additional piece of software and modules can easily be
swapped out if the content requirements change.

Nginx

Nginx does not have any ability to process dynamic content natively. To handle
PHP and other requests for dynamic content, Nginx must pass the request to an
external processor for execution and wait for the rendered content to be sent
back. The results can then be relayed to the client.

For administrators, this means that communication must be configured between
Nginx and the processor over one of the protocols Nginx knows how to speak
(http, FastCGI, SCGI, uWSGI, memcache). This can complicate things slightly,
especially when trying to anticipate the number of connections to allow, as an
additional connection will be used for each call to the processor.

However, this method has some advantages as well. Since the dynamic
interpreter is not embedded in the worker process, its overhead will only be
present for dynamic content. Static content can be served in a straight-
forward manner and the interpreter will only be contacted when needed. Apache
can also function in this manner, but doing so removes the benefits in the
previous section.

Distributed vs Centralized Configuration

For administrators, one of the most readily apparent differences between these
two pieces of software is whether directory-level configuration is permitted
within the content directories.

Apache

Apache includes an option to allow additional configuration on a per-directory
basis by inspecting and interpreting directives in hidden files within the
content directories themselves. These files are known as .htaccess files.

Since these files reside within the content directories themselves, when
handling a request, Apache checks each component of the path to the requested
file for an .htaccess file and applies the directives found within. This
effectively allows decentralized configuration of the web server, which is
often used for implementing URL rewrites, access restrictions, authorization
and authentication, even caching policies.

While the above examples can all be configured in the main Apache
configuration file, .htaccess files have some important advantages. First,
since these are interpreted each time they are found along a request path,
they are implemented immediately without reloading the server. Second, it
makes it possible to allow non-privileged users to control certain aspects of
their own web content without giving them control over the entire
configuration file.

This provides an easy way for certain web software, like content management
systems, to configure their environment without providing access to the
central configuration file. This is also used by shared hosting providers to
retain control of the main configuration while giving clients control over
their specific directories.

Nginx

Nginx does not interpret .htaccess files, nor does it provide any mechanism
for evaluating per-directory configuration outside of the main configuration
file. This may be less flexible than the Apache model, but it does have its
own advantages.

The most notable improvement over the .htaccess system of directory-level
configuration is increased performance. For a typical Apache setup that may
allow .htaccess in any directory, the server will check for these files in
each of the parent directories leading up to the requested file, for each
request. If one or more .htaccess files are found during this search, they
must be read and interpreted. By not allowing directory overrides, Nginx can
serve requests faster by doing a single directory lookup and file read for
each request (assuming that the file is found in the conventional directory
structure).

Another advantage is security related. Distributing directory-level
configuration access also distributes the responsibility of security to
individual users, who may not be trusted to handle this task well. Ensuring
that the administrator maintains control over the entire web server can
prevent some security missteps that may occur when access is given to other
parties.

Keep in mind that it is possible to turn off .htaccess interpretation in
Apache if these concerns resonate with you.

File vs URI-Based Interpretation

How the web server interprets requests and maps them to actual resources on
the system is another area where these two servers differ.

Apache

Apache provides the ability to interpret a request as a physical resource on
the filesystem or as a URI location that may need a more abstract evaluation.
In general, for the former Apache uses <Directory> or <Files> blocks, while it
utilizes <Location> blocks for more abstract resources.

Because Apache was designed from the ground up as a web server, the default is
usually to interpret requests as filesystem resources. It begins by taking the
document root and appending the portion of the request following the host and
port number to try to find an actual file. Basically, the filesystem hierarchy
is represented on the web as the available document tree.

Apache provides a number of alternatives for when the request does not match
the underlying filesystem. For instance, an Alias directive can be used to map
to an alternative location. Using <Location> blocks is a method of working
with the URI itself instead of the filesystem. There are also regular
expression variants which can be used to apply configuration more flexibly
throughout the filesystem.

While Apache has the ability to operate on both the underlying filesystem and
the webspace, it leans heavily towards filesystem methods. This can be seen in
some of the design decisions, including the use of .htaccess files for per-
directory configuration. The Apache docs themselves warn against using URI-
based blocks to restrict access when the request mirrors the underlying
filesystem.

Nginx

Nginx was created to be both a web server and a proxy server. Due to the
architecture required for these two roles, it works primarily with URIs,
translating to the filesystem when necessary.

This can be seen in some of the ways that Nginx configuration files are
constructed and interpreted. Nginx does not provide a mechanism for specifying
configuration for a filesystem directory and instead parses the URI itself.

For instance, the primary configuration blocks for Nginx are server and
location blocks. The server block interprets the host being requested, while
the location blocks are responsible for matching portions of the URI that
comes after the host and port. At this point, the request is being interpreted
as a URI, not as a location on the filesystem.

For static files, all requests eventually have to be mapped to a location on
the filesystem. First, Nginx selects the server and location blocks that will
handle the request and then combines the document root with the URI, adapting
anything necessary according to the configuration specified.
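A minimal sketch of this flow (server name and paths are illustrative):

```nginx
server {
    listen 80;
    server_name example.com;   # selected by matching the Host header

    root /var/www/example;     # document root used once a file is needed

    # Matched against the URI, not the filesystem; a request for
    # /images/logo.png ends up serving /var/www/example/images/logo.png
    location /images/ {
        expires 7d;
    }

    location / {
        try_files $uri $uri/ =404;
    }
}
```

Note that the `location` blocks say nothing about directories on disk; the translation to a file path happens only after a block has been selected.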

This may seem similar, but parsing requests primarily as URIs instead of
filesystem locations allows Nginx to more easily function in both web, mail,
and proxy server roles. Nginx is configured simply by laying out how to
respond to different request patterns. Nginx does not check the filesystem
until it is ready to serve the request, which explains why it does not
implement a form of .htaccess files.

Modules

Both Nginx and Apache are extensible through module systems, but the two
implementations differ significantly.

Apache

Apache’s module system allows you to dynamically load or unload modules to
satisfy your needs during the course of running the server. The Apache core is
always present, while modules can be turned on or off, adding or removing
additional functionality and hooking into the main server.

Apache uses this functionality for a large variety of tasks. Due to the maturity
of the platform, there is an extensive library of modules available. These can
be used to alter some of the core functionality of the server, such as
mod_php, which embeds a PHP interpreter into each running worker.

Modules are not limited to processing dynamic content, however. Among other
functions, they can be used for rewriting URLs, authenticating clients,
hardening the server, logging, caching, compression, proxying, rate limiting,
and encrypting. Dynamic modules can extend the core functionality considerably
without much additional work.
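For example, dynamically enabling a module is a single directive in the main configuration (the module path varies by distribution):

```apacheconf
# Load mod_rewrite at startup; comment out to disable it again
LoadModule rewrite_module modules/mod_rewrite.so
```

On Debian-based systems, the `a2enmod`/`a2dismod` helpers manage equivalent configuration includes for you.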

Nginx

Nginx also implements a module system, but it is quite different from the
Apache system. In Nginx, modules are not dynamically loadable, so they must be
selected and compiled into the core software (surprisingly, modules are not
loaded dynamically).

For many users, this will make Nginx much less flexible. This is especially
true for users who are not comfortable maintaining their own compiled software
outside of their distribution’s conventional packaging system. While
distributions’ packages tend to include the most commonly used modules, if you
require a non-standard module, you will have to build the server from source
yourself.
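As a sketch, module selection happens at build time through configure flags (the two flags shown are real nginx build options; the exact set you need depends on your use case):

```shell
# From the unpacked nginx source tree: pick modules at compile time.
# --with-* adds optional modules; --without-* removes default ones.
./configure --with-http_ssl_module --without-http_autoindex_module
make
sudo make install
```

The resulting binary contains exactly the selected functionality and nothing else, which is the trade-off discussed below.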

Nginx modules are still very useful though, and they allow you to dictate what
you want out of your server by only including the functionality you intend to
use. Some users may also consider this more secure, as arbitrary components
cannot be hooked into the server. However, if your server is ever put in a
position where this is possible, it is likely compromised already.

Nginx modules allow many of the same capabilities as Apache modules. For
instance, Nginx modules can provide proxying support, compression, rate
limiting, logging, rewriting, geolocation, authentication, encryption,
streaming, and mail functionality.

Support, Compatibility, Ecosystem, and Documentation

A major point to consider is what the actual process of getting up and running
will be given the landscape of available help and support among other
software.

Apache

Because Apache has been popular for so long, support for the server is fairly
ubiquitous. There is a large library of first- and third-party documentation
available for the core server and for task-based scenarios involving hooking
Apache up with other software.

Along with documentation, many web projects and tools include scripts to
bootstrap themselves within an Apache environment. This may be included in the
projects themselves, or in the packages maintained by your distribution’s
packaging team.

Apache, in general, will have more support from third-party projects simply
because of its market share and the length of time it has been available.
Administrators are also somewhat more likely to have experience working with
Apache not only due to its prevalence, but also because many people start off
in shared-hosting scenarios which almost exclusively rely on Apache due to the
.htaccess distributed management capabilities.

Nginx

Nginx is experiencing increased support as more users adopt it for its
performance profile, but it still has some catching up to do in some key
areas.

In the past, it was difficult to find comprehensive English-language
documentation regarding Nginx due to the fact that most of the early
development and documentation were in Russian. As interest in the project
grew, the documentation has been filled out and there are now plenty of
administration resources on the Nginx site and through third parties.

In regards to third-party applications, support and documentation is becoming
more readily available, and package maintainers are beginning, in some cases,
to give choices between auto-configuring for Apache and Nginx. Even without
support, configuring Nginx to work with alternative software is usually
straightforward so long as the project itself documents its requirements
(permissions, headers, etc.).

Using Apache and Nginx Together

After going over the benefits and limitations of both Apache and Nginx, you
may have a better idea of which server is more suited to your needs. However,
many users find that it is possible to leverage each server’s strengths by
using them together.

The conventional configuration for this partnership is to place Nginx in front
of Apache as a reverse proxy (Nginx acts as the reverse proxy). This will allow Nginx to handle all
requests from clients. This takes advantage of Nginx’s fast processing speed
and ability to handle large numbers of connections concurrently.

For static content, which Nginx excels at, the files will be served quickly
and directly to the client. For dynamic content, for instance PHP files, Nginx
will proxy the request to Apache, which can then process the results and
return the rendered page. Nginx can then pass the content back to the client.
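A common sketch of this split, assuming Apache listens locally on port 8080 (ports, names, and paths are illustrative):

```nginx
server {
    listen 80;
    server_name example.com;
    root /var/www/example;

    # Static files: served directly by Nginx; anything it cannot
    # find falls through to the Apache backend
    location / {
        try_files $uri $uri/ @apache;
    }

    # Dynamic content (e.g. PHP): hand off to Apache
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location @apache {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```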

This setup works well for many people because it allows Nginx to function as a
sorting machine. It will handle all requests it can and pass on the ones that
it has no native ability to serve. By cutting down on the requests the Apache
server is asked to handle, we can alleviate some of the blocking that occurs
when an Apache process or thread is occupied.

This configuration also allows you to scale out by adding additional backend
servers as necessary. Nginx can be configured to pass to a pool of servers
easily, increasing this configuration’s resilience to failure and performance.
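Scaling the backend out is then a matter of swapping the single backend address for an upstream pool (addresses are illustrative):

```nginx
upstream apache_backend {
    # Requests are balanced across these servers; a server that
    # fails its connections is temporarily taken out of rotation
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080 backup;  # used only when the others are down
}

server {
    listen 80;
    location / {
        proxy_pass http://apache_backend;
        proxy_set_header Host $host;
    }
}
```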

Conclusion

As you can see, both Apache and Nginx are powerful, flexible, and capable.
Deciding which server is best for you is largely a function of evaluating your
specific requirements and testing with the patterns that you expect to see.

There are differences between these projects that have a very real impact on
the raw performance, capabilities, and the implementation time necessary to
get each solution up and running. However, these usually are the result of a
series of trade offs that should not be casually dismissed. In the end, there
is no one-size-fits-all web server, so use the solution that best aligns with
your objectives.
